GPT-4V

Describe - Simply describing what is in an image

Interpret - The biggest of the lot, explain the meaning and provide more context behind an image. This is the layer deeper than a surface level description.

1711067515600646585photo1

1711067515600646585photo2

1711067515600646585photo3

1711067515600646585photo4

ChatGPT-4V Multimodal identifies and decodes a schematic from 1954.

1710270517364867203-1

1710270517364867203-2

1710270517364867203-3

1710270517364867203-4

Recommend - Offer critiques, suggested changes, or recommendations based off an image

1710708040226226564

1707147314660270588-1

1707147314660270588-2

 

 

Convert - Convert images into other forms (code, narrative, etc.) or generate something new. Massive opportunity to build a ton of product here. Major things to come

1710199020499726535

Extract - Extract entities within an image or provide structured output

1710525260578410754

1708557028149673990

Assist - Offer solutions based on the image

1709638697644109984

 

在游戏中给出过关建议

1709382225748226191-2

1709382225748226191-1

 

Evaluate - Subjective judgement based on the image

小狗可爱程度评估

1710482288658481154-3

1710482288658481154-2

1710482288658481154-3

1710382165639270485

1709636152628527468


https://ipfs.ee/ipfs/QmWfQv4RdSxpv57rTNknqCdARWwPykwYq6krLC7Md6MRjc/F8Fs8qWa0AAzk2z.jpg

references: https://x.com/GregKamradt/status/1711772496159252981?s=20