AI Translation Algorithm Engineer

Bitget

Software Engineering, Data Science
Posted on Sep 27, 2025

About us

Bitget is one of the world's leading digital assets ecosystems. With over 120 million registered users, Bitget has one of the most comprehensive suites of blockchain products and services available via bitget.com.

Our mission is to support the growth of the digital assets industry and we believe it represents the future of finance. What we do empowers the future of finance by ensuring secure, efficient and smart digital transactions.

We are one of the fastest growing companies in the digital asset sector. If you are looking for cutting-edge work, where you will have opportunities to develop your career among peers who are experts in their field, and you believe in the future of digital currency, then look no further than Bitget!

What you'll do

  • Own the overall design and iteration of the multilingual translation core, covering data processing, model training, the inference framework, service deployment, and performance optimization, balancing quality, latency, and cost.
  • Based on the company's business domains, work with the localization team to build and maintain a termbase, bilingual dictionaries, and translation memory (TM); enforce hard terminology constraints and consistency control; and integrate with human linguist workflows and the machine translation post-editing (MTPE) process.
  • Design the multi-source knowledge-enhanced translation (RAG for MT) solution: index, retrieve, and inject context from knowledge bases such as white papers, protocol specifications, market and on-chain data, brand copy, and FAQs to improve the accuracy and stability of in-domain entity and proper-noun translation.
  • Explore and implement hybrid architectures that combine large models with NMT, including coordination between multilingual Transformer/NMT models (such as Marian/NLLB/M2M100) and LLMs (such as LLaMA/Qwen/mT5), instruction fine-tuning (SFT), preference optimization (DPO/ORPO/reward modeling), knowledge distillation, and Adapter/LoRA multi-task training.
  • Build the data pipeline: bilingual corpus mining, quality assessment and cleaning (deduplication, alignment, sentence segmentation, normalization), noise modeling, alignment enhancement, automatic term annotation, incremental updates, and a closed data loop.
  • Establish automated and human evaluation processes: offline metrics such as BLEU/chrF/COMET/BERTScore, combined with business-specific term consistency scoring, do-not-translate and omission detection, compliance-sensitive word identification, and style consistency and readability scoring; set up online A/B and gradual-rollout (canary) experiments.
  • Inference and system optimization: quantization (INT8/FP8), pruning, KV caching, tensor/pipeline parallelism, graph compilation (TensorRT/ONNX/FasterTransformer), high-concurrency serving (Triton/vLLM), and SLAs and elastic scaling (K8s/Ray), to achieve low latency and high throughput.
  • Collaborate with product, front-end, back-end, content, and compliance teams to provide translation capabilities and implementation plans for a range of business scenarios.
  • Track the latest advances and pilot them quickly: multilingual large models, retrieval augmentation, structured prompt engineering, lexically constrained decoding, controllable generation, soft and hard terminology constraints, online learning, and human-in-the-loop (HITL) feedback.

What you'll need

  • Educational Background: A degree in computer science, artificial intelligence, natural language processing, computational linguistics, or a related field; a master's degree or above is preferred.
  • Work Experience: 5-10 years of industry experience in NLP, machine translation, or multilingual models, with end-to-end experience taking systems from zero to one and on to large-scale deployment.

Technical Skills:

  • Solid understanding of machine translation and large model principles: Transformer, multilingual NMT, alignment and tokenization (SentencePiece/BPE), contrastive learning, instruction fine-tuning and preference optimization, distillation and low-rank adaptation, retrieval augmentation and knowledge injection.
  • Evaluation and data governance: Familiar with metrics such as BLEU/chrF/COMET/BERTScore and with human evaluation standards; command of methodologies for bilingual data cleaning, denoising, alignment, domain adaptation, and terminology consistency control.
  • Engineering Stack: Proficient in PyTorch/JAX; familiar with Transformers/DeepSpeed/Accelerate; skilled with K8s, Docker, CI/CD, monitoring, and logging; strong code quality and documentation skills.

Domain Experience:

  • Experience translating crypto, blockchain, or fintech content, or putting specialized terminology into production; familiarity with common tokens, protocol types, and industry context; a basic understanding of internationalization and compliance (personal information, sensitive words, and regional compliance).
  • General Qualities: Strong problem decomposition and experiment design skills; results-oriented and able to make progress under uncertainty; good cross-team and Chinese-English communication skills; able to conduct technical design reviews and share knowledge.

Bonus Points

  • Papers, shared-task participation, or talks at machine translation evaluations such as WMT/IWSLT, or at conferences and journals such as ACL/EMNLP/NAACL/NeurIPS;
  • Led or played a core role in launching and continuously operating multilingual translation systems (≥ 20 languages) in production;
  • Systematic methods and reusable components for constrained generation, term alignment, explainable error typing, and error-correction pipelines when applying LLMs to translation;
  • Familiar with building end-to-end corpus pipelines: web crawling, deduplication, hybrid rule- and model-based scoring, back-translation (BT/BT-Tagging), quality estimation (QE), etc.;
  • Solid understanding of and hands-on experience with AI Coding/Vibe Coding.

Why Bitget?

  • Bitget is the world's leading Web3 platform for copy trading and one of the world's largest and most respected exchanges
  • We are a global company with staff members from over 50 different countries and regions
  • We are growing and looking for ambitious, world-class talent to help us continue this journey
  • We have a streamlined structure that empowers employees to work efficiently, delivering the best results in a short timeframe
  • We offer competitive salaries and benefits
  • Blockchain technology and digital assets have the potential to change finance in a way no other technology can - be part of it!

If you are ambitious and believe that digital assets could be the next financial and technological revolution, please apply!

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.
