
RAG with text-embedding-3-large + Qwen Plus

Pair the launch embedding model with a value-tier Chinese chat model to make your documents searchable.

Architecture

Documents → text-embedding-3-large vectors → store in any vector DB → at query time embed the question → top-k retrieval → feed to Qwen Plus for the answer.

Embed your corpus

Use the OpenAI Python SDK pointed at SeaLink's base URL. text-embedding-3-large returns 3072-dimensional vectors and is the embedding model available at launch.

embed_corpus.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sealink.asia/v1",
    api_key="<your-sealink-key>",
)

docs = ["Document 1...", "เอกสาร 2...", "文档 3..."]

res = client.embeddings.create(model="text-embedding-3-large", input=docs)

# Each embedding has 3072 dimensions
vectors = [d.embedding for d in res.data]
# Now insert into pgvector / Qdrant / Weaviate / etc.
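
Retrieve top-k chunks

At query time, embed the question with the same model and pull the most similar chunks from your vector store. If you want to sanity-check the pipeline before wiring up a database, here is a minimal in-memory sketch using cosine similarity over the vectors from embed_corpus.py; the retrieve helper, the numpy dependency, and the k=3 default are illustrative, not part of SeaLink's API, and it reuses the client defined above.

retrieve.py
import numpy as np

def retrieve(question, vectors, docs, k=3):
    # Embed the query with the same model used for the corpus
    q = client.embeddings.create(
        model="text-embedding-3-large", input=[question]
    ).data[0].embedding
    q = np.array(q)

    # Cosine similarity between the query and every stored vector
    mat = np.array(vectors)
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))

    # Indices of the k most similar chunks, best match first
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]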

Answer with Qwen Plus

After retrieving the top-k chunks, build a prompt and call Qwen Plus. It handles Chinese, English, and Southeast Asian languages natively.

answer.py
def answer(question, retrieved_chunks):
    context = "\n\n".join(retrieved_chunks)
    prompt = f"""Answer the question using only the context below. If the answer isn't there, say so.
Context:
{context}
Question: {question}
Answer:"""
    resp = client.chat.completions.create(
        model="qwen3-plus",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=600,
    )
    return resp.choices[0].message.content
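
Putting the two steps together looks like the snippet below; the example question and the retrieve helper from the sketch above are illustrative, so substitute your vector DB's query call if you use one.

question = "What is the refund policy?"
chunks = retrieve(question, vectors, docs, k=3)
print(answer(question, chunks))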

Cost ballpark

10K queries per month, each with roughly 300 input tokens of retrieved context and 200 output tokens of answer, comes to about $1.30 on Qwen Plus. text-embedding-3-large adds about $0.13 per million input tokens for embedding your corpus and queries.
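
To re-run the estimate with your own volumes, the arithmetic is just token counts times per-million rates. The helper below is a sketch; the rate arguments are placeholders, so plug in the current Qwen Plus numbers from SeaLink's pricing page.

cost_estimate.py
def monthly_cost(queries, ctx_tokens, ans_tokens, in_rate, out_rate):
    # Rates are USD per million tokens; result is USD per month
    return queries * (ctx_tokens * in_rate + ans_tokens * out_rate) / 1e6

# Example shape of the call, with rates taken from SeaLink's pricing page:
# monthly_cost(10_000, 300, 200, in_rate=<input $/M>, out_rate=<output $/M>)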