Bing Chat Becomes More Responsive with New Backend


Bing Chat, Microsoft’s AI-powered chatbot and search experience, recently received a major backend upgrade. The change cuts time to first token by roughly 25%, making the experience noticeably more responsive. The improvements were made to the technology that powers Bing Chat: the new backend is more efficient and uses fewer resources, which reduces latency.

The changes were announced on Twitter by Mikhail Parakhin, who heads Bing as Microsoft’s CEO of Advertising and Web Services. He described the upgrade as “a completely reworked backend for internal monologue, reducing time to first token by ~25%, and, much more importantly, making latency more stable, reducing spikes.”
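To make the two metrics in that announcement concrete, here is a minimal sketch of how time to first token and latency stability could be measured for any streaming response. It is not Bing's instrumentation; the function names, the injectable clock, and the sample numbers are assumptions chosen for illustration.

```python
import statistics
import time
from typing import Callable, Iterable


def time_to_first_token(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Seconds from starting to read `stream` until its first token arrives."""
    start = clock()
    for _ in stream:
        # Stop at the first token; the rest of the stream is not consumed.
        return clock() - start
    raise ValueError("stream ended without producing a token")


def relative_reduction(before: float, after: float) -> float:
    """Fractional improvement, e.g. 0.25 for a 25% reduction."""
    return (before - after) / before


def latency_summary(samples: list[float]) -> dict[str, float]:
    """Median, nearest-rank 95th percentile, and maximum of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # ceil(0.95 * n) computed with integers to avoid float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

A hypothetical drop from 400 ms to 300 ms gives `relative_reduction(400, 300) == 0.25`. The summary shows why stability matters separately: two slow responses out of twenty leave the median unchanged but push the 95th percentile up to the spike value.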

Michael Schechter, a product manager at Bing, also commented on the change. He said the latency improvement “represents a lot of work and a significant improvement to the overall experience.”

Fun fact: internally, we are very passionate about something that the majority of people find boring. Yesterday we released a completely reworked backend for internal monolog, reducing time to first token by ~25%, and, much more importantly, making latency more stable, reducing spikes:

— Mikhail Parakhin (@MParakhin) June 29, 2023

How Bing Chat Delivers AI Search

Bing Chat combines Microsoft’s own technologies, such as Microsoft Graph, with OpenAI’s GPT large language model. In March 2023, this included an upgrade to GPT-4. Bing Chat had been in development for several years, and adding GPT capabilities in 2022 accelerated the project. Microsoft built the Prometheus platform to support this experience.

In essence, Prometheus couples Bing search with the natural language processing of OpenAI’s GPT models. Jordi Ribas, Bing’s head of engineering, points out that the combination makes the chatbot’s answers more accurate:

“Thanks to Bing’s grounding technique, Prometheus can also integrate citations into sentences in Chat answers so that users can easily click through to access the sources and verify the information. Sending traffic to these sources is important for a healthy web ecosystem and remains one of our main goals for Bing.”
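The idea in that quote, attaching clickable source references to individual sentences of an answer, can be illustrated with a small sketch. This is not how Prometheus is implemented; the `Source` type, the `render_with_citations` function, and the example.com URLs are hypothetical names used only to show the shape of the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    title: str
    url: str


def render_with_citations(
    sentences: list[tuple[str, list[int]]],
    sources: list[Source],
) -> str:
    """Append [n] markers to each sentence and list the sources underneath.

    `sentences` pairs each sentence with zero-based indices into `sources`.
    """
    parts = []
    for text, refs in sentences:
        for i in refs:
            if not 0 <= i < len(sources):
                raise IndexError(f"no source with index {i}")
        markers = "".join(f"[{i + 1}]" for i in refs)
        # Sentences without sources are emitted unchanged.
        parts.append(f"{text} {markers}" if markers else text)
    footnotes = "\n".join(
        f"[{n}] {src.title} - {src.url}" for n, src in enumerate(sources, start=1)
    )
    return " ".join(parts) + "\n\n" + footnotes
```

Keeping the markers next to the sentence they support, rather than in one list at the end, is what lets a reader check a specific claim against a specific source.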

Bing Chat: What’s Behind It

Bing Chat uses a hybrid of rule-based and neural components to handle different types of user requests. For example, if a user is looking for information, Bing Chat will perform a web search and provide a factual answer with references and links.

If a user is looking for creative content, such as poetry, stories, code, essays, songs, or parodies of celebrities, Bing will create it using its own words and knowledge. If users need help rewriting, improving, or optimizing their content, Bing Chat will help them too. If users want to have fun or learn something new, Bing Chat will offer jokes, trivia, games or educational content.
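The request types described above suggest a routing step in front of the model. The sketch below shows one common way to combine rules with a learned component: cheap keyword rules run first, and anything they miss is handed to a fallback classifier. The intent names, the keyword lists, and the `fallback` parameter are all assumptions for illustration, not Bing Chat's actual logic.

```python
import re
from typing import Callable, Optional

# Hypothetical rules, checked in order; the first match wins.
RULES: list[tuple[str, re.Pattern[str]]] = [
    ("rewrite", re.compile(r"\b(rewrite|improve|optimize|paraphrase)\b", re.I)),
    ("creative", re.compile(r"\b(poem|story|song|essay|code|parody)\b", re.I)),
    ("fun", re.compile(r"\b(joke|trivia|game|quiz)\b", re.I)),
]


def route(message: str, fallback: Optional[Callable[[str], str]] = None) -> str:
    """Return an intent label for `message`.

    Rule-based matching is tried first. If no rule fires, `fallback` (a
    stand-in for a neural intent classifier) decides; without a fallback the
    request is treated as an information lookup and sent to web search.
    """
    for intent, pattern in RULES:
        if pattern.search(message):
            return intent
    if fallback is not None:
        return fallback(message)
    return "search"
```

With these rules, "Write a poem about rain" routes to `creative`, "Rewrite this paragraph" to `rewrite`, and "population of Jakarta" falls through to `search`.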

To provide these capabilities, Bing Chat leverages Azure Cognitive Services such as LUIS, QnA Maker, Text Analytics, and Speech Services. These services enable Bing Chat to understand user intent and context, answer questions, analyze sentiment, and recognize speech. Bing Chat also uses Azure Machine Learning and Azure Databricks to train and deploy custom models for content creation, summarization, paraphrasing, and rewriting tasks.

To improve scalability, reliability, and performance, Bing Chat has adopted a microservices architecture and a serverless computing model. It uses Azure Functions, Azure Service Bus, Azure Event Grid, and Azure Cosmos DB to manage data flow and requests between its services and components.

