中文字幕,欧美,日韩,中文字幕天天躁日日躁狠狠躁免费

在阿里通義今晨發(fā)布Qwen3-VL系列新成員Qwen3-VL-4B和Qwen3-VL-8B之際，英特爾于今日同步宣布，已經(jīng)在酷睿 Ultra 平臺上完成對這些最新模型的適配。此次Day 0支持延續(xù)了十天前對Qwen3新模型快速適配的卓越速度，再次印證了英特爾在加速AI技術(shù)創(chuàng)新、積極構(gòu)建模型合作生態(tài)方面的深度投入與行動力。

此次發(fā)布的Qwen3-VL系列新模型，在延續(xù)其卓越的文本理解和生成、深度視覺感知與推理、更長的上下文長度、增強的空間與視頻動態(tài)理解及強大代理交互能力的同時，憑借其輕量化的模型參數(shù)設(shè)計，在英特爾酷睿Ultra平臺上可以實現(xiàn)高效部署，為復(fù)雜的圖片和視頻理解及智能體應(yīng)用帶來更出色的性能與體驗。

為確保用戶能夠獲得更流暢的AI體驗，英特爾在酷睿Ultra平臺上，對Qwen3-VL-4B 模型進行了創(chuàng)新的CPU、GPU和NPU混合部署，充分釋放了XPU架構(gòu)的強大潛力。通過精巧地分解并優(yōu)化復(fù)雜的視覺語言模型負(fù)載鏈路，并將更多負(fù)載精準(zhǔn)調(diào)度至專用的NPU上，此次英特爾的Day 0支持實現(xiàn)了：

顯著的能效優(yōu)化：大幅降低CPU占用率，更好地支持用戶并發(fā)應(yīng)用。
卓越的性能表現(xiàn)：在混合部署場景中，模型運行吞吐量達到22.7tps。
流暢的用戶體驗：充分利用酷睿Ultra的跨平臺能力，提供無縫的AI交互。

以下的演示視頻充分地展示了該成果：Qwen3-VL-4B模型在圖片理解與分析任務(wù)中，在高效利用NPU算力的同時，顯著降低了CPU的資源占用。

（演示視頻: 在英特爾在酷睿Ultra平臺上，Qwen3-VL-4B釋放系統(tǒng)資源帶來流暢體驗）

快速上手指南

第一步 環(huán)境準(zhǔn)備

基于以下命令可以完成模型部署任務(wù)在Python上的環(huán)境安裝。

python -m venv py_venv

./py_venv/Scripts/activate.bat
pip uninstall -y optimum transformers optimum-intel

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 –index-url https://download.pytorch.org/whl/cpu

pip install git+https://github.com/openvino-dev-samples/optimum.git@qwen3vl

pip install git+https://github.com/openvino-dev-samples/transformers.git@qwen3vl

pip install git+https://github.com/openvino-dev-samples/optimum-intel.git@qwen3vl

pip install –pre -U openvino –extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly

該示例在以下環(huán)境中已得到驗證：

硬件環(huán)境:
- 英特爾^? 酷睿? Ultra 7 258V
- iGPU驅(qū)動版本：32.0.101.6733
- 內(nèi)存: 32GB
操作系統(tǒng)：
- Windows 11 24H2 (26100.4061)
OpenVINO版本:
- openvino 2025.3.0

第二步 模型下載和轉(zhuǎn)換

在部署模型之前，首先需要將原始的PyTorch模型轉(zhuǎn)換為OpenVINO^TM的IR靜態(tài)圖格式，并對其進行壓縮，以實現(xiàn)更輕量化的部署和最佳的性能表現(xiàn)。通過Optimum提供的命令行工具optimum-cli，可以一鍵完成模型的格式轉(zhuǎn)換和權(quán)重量化任務(wù)：

optimum-cli export openvino –model Qwen/Qwen3-VL-4B-Instruct –trust-remote-code –weight-format int4 –task image-text-to-text Qwen3-VL-4B-Instruct-ov

開發(fā)者可以根據(jù)模型的輸出結(jié)果，調(diào)整其中的量化參數(shù)，包括：

–model：為模型在HuggingFace上的model id，這里也提前下載原始模型，并將model id替換為原始模型的本地路徑，針對國內(nèi)開發(fā)者，推薦使用ModelScope魔搭社區(qū)作為原始模型的下載渠道，具體加載方式可以參考ModelScope官方指南：https://www.modelscope.cn/docs/models/download
–weight-format：量化精度，可以選擇fp32,fp16,int8,int4,int4_sym_g128,int4_asym_g128,int4_sym_g64,int4_asym_g64
–group-size：權(quán)重里共享量化參數(shù)的通道數(shù)量
–ratio：int4/int8權(quán)重比例，默認(rèn)為1.0，0.6表示60%的權(quán)重以int4表，40%以int8表示
–sym：是否開啟對稱量化

第三步 模型部署

除了利用Optimum-cli工具導(dǎo)出OpenVINO^TM模型外，我們還在Optimum-intel中重構(gòu)了Qwen3-VL和Qwen3-VL-MOE模型的Pipeline，將官方示例示例中的的Qwen3VLForConditionalGeneration替換為OVModelForVisualCausalLM便可快速利用OpenVINO^TM進行模型部署，完整示例可參考以下代碼流程。

from transformers import AutoProcessor

from optimum.intel import OVModelForVisualCausalLM

# default: Load the model on the available device(s)

model = OVModelForVisualCausalLM.from_pretrained(

“Qwen3-VL-4B-Instruct-ov”, device=”GPU”

)

processor = AutoProcessor.from_pretrained(“Qwen3-VL-4B-Instruct-ov”)

messages = [

{

“role”: “user”,

“content”: [

{

“type”: “image”,

“image”: “https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg”,

{“type”: “text”, “text”: “Describe this image.”},

}

]

# Preparation for inference

inputs = processor.apply_chat_template(

messages,

tokenize=True,

add_generation_prompt=True,

return_dict=True,

return_tensors=”pt”

)

# Inference: Generation of the output

generated_ids = model.generate(**inputs, max_new_tokens=128)

generated_ids_trimmed = [

out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)

]

output_text = processor.batch_decode(

generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False

)

print(output_text)

以下為該模型在圖像理解任務(wù)中的輸出示例：

（圖片由AI生成，僅做效果演示）

‘This is a heartwarming, sun-drenched photograph capturing a tender moment between a woman and her dog on a beach at sunset.\n\n**Key Elements:**\n\n* **The Subjects:** A young woman with long dark hair, wearing a plaid shirt, sits on the sand. Beside her, a large, light-colored dog, likely a Labrador Retriever, sits attentively, wearing a harness. The two are engaged in a playful, paw-to-paw high-five or “pawshake” gesture, a clear sign of their bond.\n* **The Setting:** They are on a wide, sandy beach.

CPU 代號名	設(shè)備	?模型	精度	輸入規(guī)模	輸出規(guī)模	第二個+ token/秒
Lunar Lake	英特爾^? 酷睿? Ultra 7 258V(XPU)	Qwen3-VL-4B-Instruct	NF4	656(1024 for LLM)	128	22.7

*性能數(shù)據(jù)基于以下測試獲得：在搭載酷睿Ultra 7 258V處理器的平臺上，采用OpenVINO框架2025.4.0.dev20250922版本，所有計算均在XPU上完成。測試評估了首個token延遲和在nf4-mixed-cw-sym精度設(shè)置下處理1K輸入時的平均吞吐量。為保證數(shù)據(jù)可靠性，每個測試均在預(yù)熱后執(zhí)行三次，并取平均值作為最終結(jié)果。性能因使用方式、配置和其他因素而異。請訪問www.Intel.com/PerformanceIndex了解更多信息。

性能結(jié)果基于測試時的配置狀態(tài)，可能未反映所有公開可用的更新內(nèi)容。請參閱相關(guān)文檔以獲取配置詳情。沒有任何產(chǎn)品或組件能夠保證絕對安全。您的實際成本和結(jié)果可能會有所不同。

相關(guān)英特爾技術(shù)可能需要啟用相關(guān)硬件、軟件或激活服務(wù)。

分享到

英特爾

lixiangjing

算力豹主編

lixiangjing

相關(guān)推薦

近期文章

熱門標(biāo)簽