Hands-On Python + 文心一言 (ERNIE Bot): Building an AI Image-to-Poem Web App

Author: 狼烟四起, 2025.10.10 16:43

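Summary: This article walks step by step through building an "AI image-to-poem" web app with Python and the 文心一言 API. It includes the project source code and detailed implementation steps, and is aimed at Python beginners and AI application developers.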

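I. Project Background and Goals

With AI advancing quickly, cross-modal interaction between images and text has become an active research area. This project uses Python and the 文心一言 API to build a web app that generates a poem from an image the user uploads. The user uploads a picture, the back end derives a text description of it, and the model writes a matching poem: "what you see becomes a poem".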

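Core value

Technical practice: combines computer vision with natural language processing
Applied innovation: explores AI use cases in artistic creation
Teaching value: provides a complete worked example of web development plus AI API calls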

II. Technology Stack

1. Development environment

Python 3.8+
Flask 2.0+ (web framework)
HTML5/CSS3/JavaScript (front end)
文心一言 API (Baidu AI Cloud, 百度智能云)

2. Installing dependencies

pip install flask requests pillow

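3. Preparing the 文心一言 API

Log in to the Baidu AI Cloud (百度智能云) platform
Create a 文心一言 application and obtain its API Key and Secret Key (the code below needs both)
Review the API calling conventions (this project uses the text-generation endpoint)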
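Because every poem request needs an access token, it can help to reuse the token instead of requesting a new one each time. The following is a minimal sketch, not part of the original project: `TokenCache` is a hypothetical helper, and it assumes the token response carries an `expires_in` field (in seconds) next to `access_token`. The `fetch` callable is injected so the logic can be tried without network access.

```python
import time
from typing import Callable, Dict, Optional


class TokenCache:
    """Hypothetical helper: reuse an access token until shortly before it expires."""

    def __init__(self, fetch: Callable[[], Dict],
                 clock: Callable[[], float] = time.time,
                 safety_margin: float = 60.0):
        # fetch() is assumed to return {"access_token": str, "expires_in": seconds}
        self._fetch = fetch
        self._clock = clock
        self._margin = safety_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or the cached one is (almost) expired
        if self._token is None or now >= self._expires_at:
            payload = self._fetch()
            self._token = payload["access_token"]
            lifetime = float(payload.get("expires_in", 0))
            self._expires_at = now + lifetime - self._margin
        return self._token
```

In the app, `fetch` would be a function that performs the token request and returns the parsed JSON; a missing `expires_in` simply causes a refresh on every call.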

III. Project Architecture

1. System architecture

Front-end page → Flask back end → 文心一言 API
     ↑                  ↓
Image upload      Poem returned
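The data flow in the diagram can be sketched as two pluggable steps: describe the image, then generate a poem from the description. This is only an illustration; `make_pipeline` and the two stand-in functions are hypothetical names that do not appear in the project, and the stand-ins do no real image analysis or API calls.

```python
from typing import Callable, Dict


def make_pipeline(describe: Callable[[bytes], str],
                  generate: Callable[[str], str]) -> Callable[[bytes], Dict[str, str]]:
    """Compose an image-describing step and a poem-generating step."""
    def run(image_bytes: bytes) -> Dict[str, str]:
        description = describe(image_bytes)   # image -> text
        poem = generate(description)          # text -> poem
        return {"description": description, "poem": poem}
    return run


# Stand-ins so the sketch runs offline (assumptions, not real services)
def fake_describe(image_bytes: bytes) -> str:
    return f"an image of {len(image_bytes)} bytes"


def fake_generate(description: str) -> str:
    return f"[poem about {description}]"


pipeline = make_pipeline(fake_describe, fake_generate)
```

Keeping the two steps separate makes it easy to swap the placeholder description step for a real recognition service later without touching the poem-generation code.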

2. Functional modules

Image upload module
Image preprocessing module
API call module
Result display module
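For the upload and preprocessing modules, one common first step is to check that the uploaded bytes really are an image before doing anything else. The sketch below uses only the standard library and recognizes JPEG, PNG and GIF by their file signatures; `sniff_image_type`, `validate_upload` and the 5 MB limit are illustrative assumptions, not part of the original code.

```python
from typing import Optional

# Well-known file signatures ("magic bytes")
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a MIME type if the bytes start with a known image signature."""
    for signature, mime in _SIGNATURES:
        if data.startswith(signature):
            return mime
    return None


def validate_upload(data: bytes, max_bytes: int = 5 * 1024 * 1024) -> str:
    """Reject empty, oversized, or non-image uploads; return the MIME type."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > max_bytes:
        raise ValueError("file too large")
    mime = sniff_image_type(data)
    if mime is None:
        raise ValueError("unsupported or non-image file")
    return mime
```

The returned MIME type could also be used when building the preview data URL, rather than assuming every upload is a JPEG.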

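IV. Detailed Implementation Steps

1. Create the basic Flask project structure

ai_poem/
├── app.py                # main program
├── templates/
│   └── index.html        # front-end page
├── static/
│   ├── css/
│   │   └── style.css     # stylesheet
│   └── js/
│       └── main.js       # front-end logic
└── requirements.txt      # dependency list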

2. Front-end page (index.html)

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>AI Image-to-Poem</title>
    <link rel="stylesheet" href="/static/css/style.css">
</head>
<body>
    <div class="container">
        <h1>AI Image-to-Poem</h1>
        <form id="upload-form" enctype="multipart/form-data">
            <input type="file" id="image-upload" accept="image/*" required>
            <button type="submit">Generate poem</button>
        </form>
        <div id="result">
            <img id="preview" src="#" alt="Preview" style="display:none;">
            <div id="poem-container"></div>
        </div>
    </div>
    <script src="/static/js/main.js"></script>
</body>
</html>

3. Flask back end (app.py)

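import base64
import os
from io import BytesIO

import requests
from flask import Flask, jsonify, render_template, request
from PIL import Image

app = Flask(__name__)

# 文心一言 API configuration. Credentials are read from the environment
# (API_KEY / SECRET_KEY) instead of being hard-coded in the source.
API_KEY = os.environ.get("API_KEY", "")
SECRET_KEY = os.environ.get("SECRET_KEY", "")
TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"
API_URL = "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/completions"
REQUEST_TIMEOUT = 30  # seconds


def get_access_token():
    """Fetch a Baidu API access token."""
    params = {
        "grant_type": "client_credentials",
        "client_id": API_KEY,
        "client_secret": SECRET_KEY,
    }
    response = requests.get(TOKEN_URL, params=params, timeout=REQUEST_TIMEOUT)
    response.raise_for_status()
    return response.json().get("access_token")


def generate_poem(image_prompt):
    """Call the 文心一言 API to generate a poem from an image description."""
    data = {
        "messages": [
            {
                "role": "user",
                # Prompt: "Write a Chinese poem based on this image description: ..."
                "content": f"根据以下图片描述创作一首中文诗歌：{image_prompt}",
            }
        ]
    }
    access_token = get_access_token()
    if not access_token:
        raise RuntimeError("No access token; check API_KEY and SECRET_KEY")
    response = requests.post(
        API_URL,
        params={"access_token": access_token},
        json=data,  # json= also sets Content-Type: application/json
        timeout=REQUEST_TIMEOUT,
    )
    response.raise_for_status()
    return response.json().get("result", "")


def image_to_base64(image_bytes):
    """Encode raw image bytes as a base64 string."""
    return base64.b64encode(image_bytes).decode("utf-8")


def analyze_image(image_bytes, filename=""):
    """Very rough image analysis (placeholder).

    A real application should call an image-recognition API for an accurate
    description. Here the uploaded file's name is the only content hint.
    """
    with Image.open(BytesIO(image_bytes)) as img:
        width, height = img.size
    name = filename.lower()
    if "sky" in name or "cloud" in name:
        return "蓝天白云，广阔无垠"  # "blue sky and white clouds, vast and boundless"
    if "flower" in name:
        return "鲜花盛开，色彩斑斓"  # "flowers in full bloom, full of colour"
    # "a {width}x{height}-pixel image, rich in content"
    return f"一张{width}x{height}像素的图片，内容丰富"


@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        file = request.files.get("file")
        if file is None or not file.filename:
            return jsonify({"error": "No image uploaded"}), 400
        # Work in memory: a fixed temp filename would collide when two
        # requests arrive at the same time.
        image_bytes = file.read()
        try:
            image_desc = analyze_image(image_bytes, file.filename)
        except OSError:  # Pillow could not read the data as an image
            return jsonify({"error": "The uploaded file is not a valid image"}), 400
        try:
            poem = generate_poem(image_desc)
        except (requests.RequestException, RuntimeError, ValueError):
            # Log details server-side; do not echo URLs or tokens to the client
            app.logger.exception("Poem generation failed")
            return jsonify({"error": "Poem generation failed"}), 502
        # Return the image as base64 so the page can show a preview
        return jsonify({"poem": poem, "image": image_to_base64(image_bytes)})
    return render_template("index.html")


if __name__ == "__main__":
    # debug=True is for local development only
    app.run(debug=True)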
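The back end reads the poem from the `result` field of the JSON response. A slightly more defensive way to handle that payload is sketched below. It assumes, as is usual for Baidu APIs, that failures come back as JSON containing `error_code` and `error_msg`; `PoemAPIError` and `extract_poem` are hypothetical names introduced for this example.

```python
class PoemAPIError(RuntimeError):
    """Raised when the API response does not contain a usable poem."""


def extract_poem(payload: dict) -> str:
    """Return the poem text from a parsed API response, or raise PoemAPIError."""
    # Assumed error shape: {"error_code": ..., "error_msg": "..."}
    if "error_code" in payload:
        message = payload.get("error_msg", "unknown error")
        raise PoemAPIError(f"{payload['error_code']}: {message}")
    poem = payload.get("result", "")
    if not isinstance(poem, str) or not poem.strip():
        raise PoemAPIError("response contained no poem text")
    return poem.strip()
```

Raising on an empty or error response lets the route return a clear failure to the page instead of an empty poem.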

4. Front-end interaction (main.js)

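document.getElementById('upload-form').addEventListener('submit', async function (e) {
    e.preventDefault();
    const fileInput = document.getElementById('image-upload');
    const formData = new FormData();
    formData.append('file', fileInput.files[0]);
    try {
        const response = await fetch('/', {
            method: 'POST',
            body: formData
        });
        if (!response.ok) {
            throw new Error('Request failed with status ' + response.status);
        }
        const result = await response.json();
        // Show the image preview
        const preview = document.getElementById('preview');
        preview.src = 'data:image/jpeg;base64,' + result.image;
        preview.style.display = 'block';
        // Show the poem. textContent is used instead of innerHTML so that
        // model output is never interpreted as HTML.
        const poemContainer = document.getElementById('poem-container');
        const heading = document.createElement('h3');
        heading.textContent = 'Generated poem:';
        const poem = document.createElement('p');
        poem.style.whiteSpace = 'pre-line'; // keep the poem's line breaks
        poem.textContent = result.poem;
        poemContainer.replaceChildren(heading, poem);
    } catch (error) {
        console.error('Error:', error);
        alert('Something went wrong while generating the poem. Please try again.');
    }
});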

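V. Directions for Improvement

1. Better image recognition

Integrate a dedicated image-recognition API (for example Baidu's vision services)
Produce a more accurate description of the image content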
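Once a recognition service is in place, its labels still have to be turned into the short description that goes into the prompt. The sketch below shows one way to do that. The input shape (a list of dicts with `keyword` and `score`) is an assumption about what such a service might return, and `labels_to_description` is a hypothetical helper, not an existing API.

```python
from typing import Iterable, List, Mapping


def labels_to_description(labels: Iterable[Mapping],
                          min_score: float = 0.3,
                          top_k: int = 3,
                          fallback: str = "一张内容丰富的图片") -> str:
    """Keep the top_k most confident, de-duplicated keywords and join them.

    The fallback string means "an image rich in content".
    """
    kept = [label for label in labels if float(label.get("score", 0)) >= min_score]
    kept.sort(key=lambda label: float(label.get("score", 0)), reverse=True)
    keywords: List[str] = []
    for label in kept:
        keyword = str(label.get("keyword", "")).strip()
        if keyword and keyword not in keywords:
            keywords.append(keyword)
        if len(keywords) == top_k:
            break
    if not keywords:
        return fallback
    # Chinese enumeration comma, since the prompt itself is written in Chinese
    return "、".join(keywords)
```

The thresholds are arbitrary starting points; in practice they would be tuned against the scores the chosen service actually produces.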

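2. Better poem generation

Add configurable poem-style parameters
Refine the result over multiple dialogue turns
Add a rhyme and meter check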
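Style parameters can be passed to the model simply by building a more specific prompt. The following is a minimal sketch under that assumption: `FORMS` and `build_prompt` are illustrative names, and the prompt wording is just one possible phrasing. The prompt stays in Chinese because the model is asked for a Chinese poem.

```python
# Classical forms: (Chinese name, number of lines, characters per line)
FORMS = {
    "wuyan_jueju": ("五言绝句", 4, 5),
    "qiyan_jueju": ("七言绝句", 4, 7),
    "wuyan_lvshi": ("五言律诗", 8, 5),
    "qiyan_lvshi": ("七言律诗", 8, 7),
}


def build_prompt(description: str, form: str = "qiyan_jueju", mood: str = "") -> str:
    """Build a prompt that asks for a specific verse form and optional mood."""
    if form not in FORMS:
        raise ValueError(f"unknown form: {form!r}")
    name, lines, chars = FORMS[form]
    # "Write a <form> from the image description, N lines of M characters each."
    parts = [f"根据以下图片描述创作一首{name}，共{lines}句，每句{chars}个字。"]
    if mood:
        parts.append(f"情感基调：{mood}。")  # "Emotional tone: ..."
    parts.append(f"图片描述：{description}")  # "Image description: ..."
    return "".join(parts)
```

For example, `build_prompt("蓝天白云", form="wuyan_jueju", mood="宁静")` asks for a four-line, five-character poem with a calm tone. Whether the model follows the constraints exactly still has to be checked on its output.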

3. Better user experience

Add a loading animation
Keep a history of generated poems
Add sharing
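A poem history can start out as a single SQLite table. The sketch below uses only the standard-library `sqlite3` module; the table layout and the function names (`open_history`, `save_poem`, `recent_poems`) are assumptions made for illustration, not part of the original project.

```python
import sqlite3
from typing import List, Tuple


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and if needed create) the history database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS poems ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "description TEXT NOT NULL, "
        "poem TEXT NOT NULL, "
        "created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def save_poem(conn: sqlite3.Connection, description: str, poem: str) -> int:
    """Insert one generated poem and return its row id."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO poems (description, poem) VALUES (?, ?)",
            (description, poem),
        )
    return cursor.lastrowid


def recent_poems(conn: sqlite3.Connection, limit: int = 10) -> List[Tuple[int, str, str]]:
    """Return the most recent poems, newest first."""
    rows = conn.execute(
        "SELECT id, description, poem FROM poems ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

In a Flask app, a connection would normally be opened per request against a file path rather than shared across threads; `:memory:` is used here only so the example runs without touching the disk.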

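VI. Complete Project Source

# Full app.py code: see above
# Full HTML and JS code: see above (style.css appears only in the project structure and is not listed in this article)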

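VII. Deployment Guide

Install dependencies: pip install -r requirements.txt

Set the environment variables (these take effect only if app.py reads API_KEY and SECRET_KEY from os.environ rather than hard-coding them):

export API_KEY=your_API_KEY
export SECRET_KEY=your_SECRET_KEY

Run the app: python app.py
Open http://localhost:5000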
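To make the environment-variable step fail fast when a key is missing, the credentials can be loaded through a small helper at startup. This is a sketch only; `load_credentials` is a hypothetical function, and it assumes the variable names `API_KEY` and `SECRET_KEY` used in the export commands above.

```python
import os
from typing import Mapping, Tuple


def load_credentials(environ: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (API_KEY, SECRET_KEY) or raise if either is missing or blank."""
    names = ("API_KEY", "SECRET_KEY")
    missing = [name for name in names if not environ.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return environ["API_KEY"].strip(), environ["SECRET_KEY"].strip()
```

Calling this once when the app starts turns a missing key into an immediate, readable error instead of a failed API call on the first upload.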

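VIII. Troubleshooting

API call fails: check that the API Key and Secret Key are correct
Image upload fails: make sure the uploaded file is an image
Poem quality is poor: adjust the prompt built from the image description
Cross-origin (CORS) errors: the page and the endpoint in this project are served by the same Flask app, so CORS does not normally arise. Debug mode does not affect it; if the front end is hosted on a different origin, the back end must send CORS headers, for example via the Flask-CORS extension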

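IX. Ideas for Extension

Add user accounts and save each user's past poems
Let users choose a verse form (five-character, seven-character, and so on)
Build a mobile-friendly version
Add multi-language support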
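If users can pick a verse form, it is worth checking that the generated text actually fits it. The sketch below checks only line count and characters per line, not rhyme or tonal patterns, which would need pronunciation data. `split_lines` and `matches_form` are hypothetical helpers, and the punctuation set is an assumption about how the model separates lines.

```python
import re
from typing import List

_CJK = re.compile(r"[\u4e00-\u9fff]")
# Lines are assumed to be separated by common punctuation or newlines
_LINE_BREAKS = re.compile(r"[，。！？；、,.!?;\n]+")


def split_lines(poem: str) -> List[str]:
    """Split a poem into verse lines, dropping punctuation and blanks."""
    segments = (segment.strip() for segment in _LINE_BREAKS.split(poem))
    return [segment for segment in segments if segment]


def matches_form(poem: str, chars_per_line: int, line_count: int) -> bool:
    """True if the poem has line_count lines of exactly chars_per_line Chinese characters."""
    lines = split_lines(poem)
    if len(lines) != line_count:
        return False
    return all(
        len(line) == chars_per_line and len(_CJK.findall(line)) == chars_per_line
        for line in lines
    )
```

A failed check could trigger a retry with a stricter prompt. Note that a title line or any explanatory text in the model output would also make the check fail, so such text needs to be stripped first.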

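X. Key Technical Takeaways

Cross-modal interaction: turning an image into text
API integration: the full flow of calling a third-party API
Front end and back end: the basic architecture of a web application
Error handling: building robust exception handling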

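This project covers the whole flow from image upload to poem generation: Flask provides the web service and the 文心一言 API does the writing. Developers can extend it as needed, for example with more accurate image analysis or tuned generation parameters. The source code given above can be run directly or used as a learning reference.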
