Dear Spatial-MLLM team,
Thank you for the great work.
I’m a bit confused about the connector configuration. Is the connector trained from scratch, or is it partially initialized from the pre-trained Qwen2.5-VL? If it is trained from scratch, the reported dataset size seems smaller than what is typical in mainstream MLLM settings, given that VLMs usually require large amounts of data to align vision and language features through the connector layer.
Could you please provide more details about how the connector is trained (e.g., initialization strategy, data scale, objectives, and training procedure)?
Best regards