RAG with text-embedding-3-large + Qwen Plus
Combine the launch embedding model with a value-tier Chinese chat model to make your documents searchable.
Architecture
Documents → text-embedding-3-large vectors → store in any vector DB → at query time embed the question → top-k retrieval → feed to Qwen Plus for the answer.
Embed your corpus
Use the OpenAI Python SDK with SeaLink's base URL. text-embedding-3-large returns 3072-dimensional vectors and is the embedding model available at launch.
embed_corpus.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sealink.asia/v1",
    api_key="<your-sealink-key>",
)

docs = ["Document 1...", "เอกสาร 2...", "文档 3..."]

res = client.embeddings.create(model="text-embedding-3-large", input=docs)

# Each embedding has 3072 dimensions
vectors = [d.embedding for d in res.data]

# Now insert into pgvector / Qdrant / Weaviate / etc.
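The vector store is left up to you above. As a minimal stand-in, here is query-time retrieval done in memory with plain numpy cosine similarity instead of a real vector DB. It reuses client, docs, and vectors from embed_corpus.py; retrieve() is a hypothetical helper written for this sketch, not part of any SDK.

import numpy as np

def retrieve(question, docs, vectors, k=3):
    """Embed the question, return the k docs most similar by cosine similarity."""
    q = client.embeddings.create(
        model="text-embedding-3-large", input=[question]
    ).data[0].embedding
    q = np.array(q)
    mat = np.array(vectors)
    # Cosine similarity: dot product divided by the product of norms.
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))
    top = np.argsort(sims)[::-1][:k]  # indices sorted by similarity, descending
    return [docs[i] for i in top]

For more than a few thousand documents, swap this for one of the vector DBs mentioned above; the interface stays the same (question in, top-k chunks out).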
Answer with Qwen Plus
After retrieving top-k chunks, build a prompt and call Qwen Plus. It handles Chinese / English / SEA languages natively.
answer.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sealink.asia/v1",
    api_key="<your-sealink-key>",
)

def answer(question, retrieved_chunks):
    context = "\n\n".join(retrieved_chunks)
    prompt = f"""Answer the question using only the context below. If the answer isn't there, say so.

Context:
{context}

Question: {question}

Answer:"""
    resp = client.chat.completions.create(
        model="qwen3-plus",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=600,
    )
    return resp.choices[0].message.content
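Assuming the pieces live in one script (the embed_corpus.py setup, the hypothetical retrieve() sketched earlier, and answer()), the full loop looks like this:

question = "What does document 1 cover?"
chunks = retrieve(question, docs, vectors, k=3)
print(answer(question, chunks))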
Cost ballpark
10K queries / month × (300 tokens of RAG context in + 200 tokens of answer out) on Qwen Plus comes to 3M input and 2M output tokens, roughly $1.30. text-embedding-3-large adds about $0.13 per million input tokens.
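A back-of-envelope check on that figure: the token volumes follow directly from the numbers above, but the per-million-token rates below are illustrative placeholders, not published SeaLink pricing. Swap in the actual Qwen Plus rates from the pricing page.

QUERIES_PER_MONTH = 10_000
CONTEXT_TOKENS = 300   # retrieved context fed to Qwen Plus per query
ANSWER_TOKENS = 200    # generated answer per query

# Placeholder rates in USD per million tokens; substitute real pricing.
# The embedding rate is the $0.13/M stated above.
QWEN_INPUT_PER_M = 0.10
QWEN_OUTPUT_PER_M = 0.50
EMBED_PER_M = 0.13

input_m = QUERIES_PER_MONTH * CONTEXT_TOKENS / 1_000_000   # 3.0M tokens
output_m = QUERIES_PER_MONTH * ANSWER_TOKENS / 1_000_000   # 2.0M tokens

chat_cost = input_m * QWEN_INPUT_PER_M + output_m * QWEN_OUTPUT_PER_M
print(f"Qwen Plus: ${chat_cost:.2f}/month")  # ~$1.30 with these placeholder rates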