Linq's AI Retrieval Model Achieves the Top Spot on the HuggingFace MTEB Leaderboard

BOSTON, June 5, 2024 /PRNewswire/ -- Linq, a generative AI startup, announced that its large embedding model "Linq-Embed-Mistral" ranked first in the text retrieval evaluation on HuggingFace's "Massive Text Embedding Benchmark (MTEB)" leaderboard, outpacing competitors like NVIDIA, Salesforce, Google, OpenAI, and Cohere. This evaluation is run by HuggingFace, the world's largest machine learning platform.

Linq's embedding model achieved a score of 60.2 points in the text retrieval category, securing the top position. This placed Linq ahead of NVIDIA, which scored 59.4 points, and Voyage AI, which scored 58.3 points. Google's model followed with a score of 55.7, while OpenAI and Cohere scored 55.4 and 55.0 points, respectively.

The MTEB leaderboard by HuggingFace ranks the performance of embedding models across seven categories: classification, clustering, pair classification, reranking, retrieval, semantic textual similarity (STS), and summarization. Linq's embedding model performed well not only in the text retrieval category but also in the other categories, earning an overall rank of third.

The MTEB leaderboard lists more than 300 embedding models, which shows how competitive the field is. Placing first in the retrieval category among them underscores the strength of Linq's embedding model technology.

Embedding models are critical in generative AI, particularly for addressing the hallucination problem of large language models (LLMs) by employing retrieval-augmented generation (RAG) technology. RAG allows models to produce reliable outputs by accessing the latest data or internal documents not available within the LLM.
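The retrieval step that RAG depends on can be illustrated with a minimal sketch. This is not Linq's model or pipeline: the `embed` function below is a toy bag-of-words stand-in for a real embedding model, and the documents, function names, and prompt layout are all hypothetical. It only shows the general pattern of embedding a query, ranking documents by cosine similarity, and passing the best match to an LLM as context.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question (hypothetical layout)."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical internal documents that an LLM would not have seen.
DOCS = [
    "The quarterly revenue report was filed in March.",
    "Office hours are nine to five on weekdays.",
    "The legal team reviewed the supplier contract.",
]

if __name__ == "__main__":
    print(build_prompt("when was the revenue report filed", DOCS))
```

In a real system the toy `embed` function would be replaced by a trained embedding model, which is why retrieval accuracy on benchmarks such as MTEB matters for the quality of RAG output.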

Leading this project, Dr. Junseong Kim stated, "Our research demonstrates that due to the broad topic diversity and challenging difficulty of retrieval data, GPT-generated data is not perfect and requires thorough verification and refinement. Through these processes, we can achieve quality comparable to human-labeled data, ultimately attaining the best retrieval performance based on the MTEB benchmark dataset. This study shows that through elaborate data crafting and filtering using GPT, we can create models optimized for retrieval-augmented generation (RAG) and maximize performance in specific fields." Additionally, he emphasized, "Not only is refined data crucial, but optimized training methodologies and rapid experimental cycles are also key to maximizing retrieval performance."

Linq's Co-founder & CEO, Jacob Choi, emphasized, "Accurate search is crucial for generative AI enterprises' adoption. We're proud to have developed the core embedding model to achieve this, and we'll keep expanding and refining it to ensure precise text searches in specialized fields like finance and legal." Choi noted that while 2023 saw the rise of B2C (business-to-consumer) use cases for generative AI with the advent of ChatGPT, 2024 will witness the growth of B2B (business-to-business) applications with improved accuracy and security technologies.

Massive Text Embedding Benchmark (MTEB) BEIR retrieval scores on HuggingFace, as of May 30, 2024.

[Company Description]

Linq (Wecover Platforms Inc) was founded in 2022 by MIT Electrical and Computer Engineering graduate Jacob Choi and MIT Computational Science and Engineering Ph.D. Subeen Pang. In 2021, Choi was named to Forbes' "30 Under 30" in the science category for his AI neuromorphic computing research. Linq received early investments from KakaoVentures, Smilegate Investment, and Yellowdog in 2022. In 2023, Linq won the Samsung Open Collaboration hosted by Samsung Financial Networks and was selected for the Fintech cohort of MassChallenge, the largest non-equity accelerator in the U.S., continuing its collaboration with KPMG US.

Contact: Jacob Choi (jacob.choi@getlinq.com)

Source: Linq (Wecover Platforms Inc)
