Boston Azure June 2024

Posted by Jason on Tuesday, June 25, 2024

Last night was the Season of AI presentation. We started with Bill Wilder presenting the fundamentals of Generative AI and quick introduction to Azure AI Studio, then I finished up with a .NET code walkthrough implement Retrieval Augmented Generation (RAG) using Semantic Kernel.

It was nice to see a lot of regular faces and meet several new people.

Demo Code

The demo code is on my GitHub repo BostonAzure-June2024 under a subdirectory.


The code is setup as the beginning of the demo (ie. simple echo client/api implementation), you’ll find the steps I used to progressively create the demo in the file.

Since I ran of time to do the last “bonus step”, you’ll find it at the end of that script along with the full content of the final code (shown below):

using Microsoft.ML.Tokenizers;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Text;
using System.Collections.Frozen;
using System.Text;

var builder = WebApplication.CreateBuilder(args);

//	.AddOpenAIChatCompletion("gpt-4o", builder.Configuration["AI:OpenAI:ApiKey"])
//	.AddOpenAITextEmbeddingGeneration("text-embedding-ada-002", builder.Configuration["AI:OpenAI:ApiKey"]);

	.AddOpenAIChatCompletion("gpt-4o", builder.Configuration["AI:OpenAI:ApiKey"], null, null, new HttpClient(new RequestAndResponseLoggingHttpClientHandler()))
	.AddOpenAITextEmbeddingGeneration("text-embedding-ada-002", builder.Configuration["AI:OpenAI:ApiKey"], null, null, new HttpClient(new RequestLoggingHttpClientHandler()));

var app = builder.Build();

// Step 2: Text Chunking
var code = File.ReadAllLines(@"transcript.txt");
var tokenizer = Tiktoken.CreateTiktokenForModel("gpt-4o");
var chunks = TextChunker.SplitPlainTextParagraphs(code, 500, 100, null, text => tokenizer.CountTokens(text));

// Step 3: Vector Store
var embeddingService = app.Services.GetRequiredService<ITextEmbeddingGenerationService>();

var memoryBuilder = new MemoryBuilder();
memoryBuilder.WithMemoryStore(new VolatileMemoryStore());

var memory = memoryBuilder.Build();

for (int i = 0; i < 10; i++)
	await memory.SaveInformationAsync("chunks", id: i.ToString(), text: chunks[i]);

app.MapGet("/copilot", async (string question, Kernel kernel) =>
	// Step 4:  Search the Vector Store
	var results = await memory.SearchAsync("chunks", question, 10, 0.6).ToListAsync();

	var context = new StringBuilder();

	int tokensRemaining = 2000;
	foreach (var result in results)
		// Keep Prompt under specific size
		if ((tokensRemaining -= tokenizer.CountTokens(result.Metadata.Text)) < 0)

		System.Console.WriteLine($"Search Result: {result.Relevance.ToString("P")}");

		//	prompt.AppendLine(result.Metadata.Text);

	var prompts = kernel.CreatePluginFromPromptDirectory("Prompts");
	return prompts["RAG"].InvokeStreamingAsync<string>(kernel, new KernelArguments()
			{ "question", question },
			{ "context", context.ToString() }



References and other learning materials

Here are some Semantic Kernel references that you may find useful:

If you have a comment, please message me @haleyjason on twitter/X.