Merge pull request #1488 from yy1781051483/master

add xunfei v3.0
feat: support send knowledge base image
2026-06-03 02:27:09 +08:00 · 2023-11-17 16:29:39 +08:00 · 2023-11-17 16:27:44 +08:00 · 2023-11-16 10:54:24 +08:00 · 2023-11-10 17:16:15 +08:00 · 2023-11-10 17:13:13 +08:00
23 changed files with 347 additions and 82 deletions
--- a/README.md
+++ b/README.md
@@ -6,17 +6,17 @@

 - [x] **多端部署：** 有多种部署方式可选择且功能完备，目前已支持个人微信，微信公众号和企业微信应用等部署方式
 - [x] **基础对话：** 私聊及群聊的消息智能回复，支持多轮会话上下文记忆，支持 GPT-3.5, GPT-4, claude, 文心一言, 讯飞星火
-- [x] **语音识别：** 可识别语音消息，通过文字或语音回复，支持 azure, baidu, google, openai等多种语音模型
-- [x] **图片生成：** 支持图片生成 和 图生图（如照片修复），可选择 Dell-E, stable diffusion, replicate, midjourney模型
+- [x] **语音识别：** 可识别语音消息，通过文字或语音回复，支持 azure, baidu, google, openai(whisper/tts) 等多种语音模型
+- [x] **图片生成：** 支持图片生成 和 图生图（如照片修复），可选择 Dall-E, stable diffusion, replicate, midjourney模型
 - [x] **丰富插件：** 支持个性化插件扩展，已实现多角色切换、文字冒险、敏感词过滤、聊天记录总结、文档总结和对话等插件
 - [X] **Tool工具：** 与操作系统和互联网交互，支持最新信息搜索、数学计算、天气和资讯查询、网页总结，基于 [chatgpt-tool-hub](https://github.com/goldfishh/chatgpt-tool-hub) 实现
-- [x] **知识库：** 通过上传知识库文件自定义专属机器人，可作为数字分身、领域知识库、智能客服使用，基于 [LinkAI](https://chat.link-ai.tech/console) 实现
+- [x] **知识库：** 通过上传知识库文件自定义专属机器人，可作为数字分身、领域知识库、智能客服使用，基于 [LinkAI](https://link-ai.tech/console) 实现

 > 欢迎接入更多应用，参考 [Terminal代码](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/terminal/terminal_channel.py)实现接收和发送消息逻辑即可接入。 同时欢迎增加新的插件，参考 [插件说明文档](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins)。

 # 演示

-https://user-images.githubusercontent.com/26161723/233777277-e3b9928e-b88f-43e2-b0e0-3cbc923bc799.mp4
+https://github.com/zhayujie/chatgpt-on-wechat/assets/26161723/d5154020-36e3-41db-8706-40ce9f3f1b1e

 Demo made by [Visionn](https://www.wangpc.cc/)

@@ -28,11 +28,15 @@ Demo made by [Visionn](https://www.wangpc.cc/)

 # 更新日志

+>**2023.11.10：** [1.5.0版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.0)，新增 `gpt-4-turbo`, `dall-e-3`, `tts` 模型接入，完善图像理解&生成、语音识别&生成的多模态能力
+
+>**2023.10.16：** 支持通过意图识别使用LinkAI联网搜索、数学计算、网页访问等插件，参考[插件文档](https://docs.link-ai.tech/platform/plugins)
+
 >**2023.09.26：** 插件增加 文件/文章链接 一键总结和对话的功能，使用参考：[插件说明](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins/linkai#3%E6%96%87%E6%A1%A3%E6%80%BB%E7%BB%93%E5%AF%B9%E8%AF%9D%E5%8A%9F%E8%83%BD)

 >**2023.08.08：** 接入百度文心一言模型，通过 [插件](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins/linkai) 支持 Midjourney 绘图

->**2023.06.12：** 接入 [LinkAI](https://chat.link-ai.tech/console) 平台，可在线创建领域知识库，并接入微信、公众号及企业微信中，打造专属客服机器人。使用参考 [接入文档](https://link-ai.tech/platform/link-app/wechat)。
+>**2023.06.12：** 接入 [LinkAI](https://link-ai.tech/console) 平台，可在线创建领域知识库，并接入微信、公众号及企业微信中，打造专属客服机器人。使用参考 [接入文档](https://link-ai.tech/platform/link-app/wechat)。

 >**2023.04.26：** 支持企业微信应用号部署，兼容插件，并支持语音图片交互，私人助理理想选择，[使用文档](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/wechatcom/README.md)。(contributed by [@lanvent](https://github.com/lanvent) in [#944](https://github.com/zhayujie/chatgpt-on-wechat/pull/944))

@@ -174,7 +178,7 @@ pip3 install azure-cognitiveservices-speech
 **5.LinkAI配置 (可选)**

 + `use_linkai`: 是否使用LinkAI接口，开启后可国内访问，使用知识库和 `Midjourney` 绘画, 参考 [文档](https://link-ai.tech/platform/link-app/wechat)
-+ `linkai_api_key`: LinkAI Api Key，可在 [控制台](https://chat.link-ai.tech/console/interface) 创建
++ `linkai_api_key`: LinkAI Api Key，可在 [控制台](https://link-ai.tech/console/interface) 创建
 + `linkai_app_code`: LinkAI 应用code，选填

 **本说明文档可能会未及时更新，当前所有可选的配置项均在该[`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py)中列出。**
@@ -267,7 +271,7 @@ volumes:

 FAQs： <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>

-或直接在线咨询 [项目小助手](https://chat.link-ai.tech/app/Kv2fXJcH)  (beta版本，语料完善中，回复仅供参考)
+或直接在线咨询 [项目小助手](https://link-ai.tech/app/Kv2fXJcH)  (beta版本，语料完善中，回复仅供参考)

 ## 联系

--- a/bot/baidu/baidu_wenxin.py
+++ b/bot/baidu/baidu_wenxin.py
@@ -16,7 +16,10 @@ class BaiduWenxinBot(Bot):

    def __init__(self):
        super().__init__()
-        self.sessions = SessionManager(BaiduWenxinSession, model=conf().get("baidu_wenxin_model") or "eb-instant")
+        wenxin_model = conf().get("baidu_wenxin_model") or "eb-instant"
+        if conf().get("model") and conf().get("model") == "wenxin-4":
+            wenxin_model = "completions_pro"
+        self.sessions = SessionManager(BaiduWenxinSession, model=wenxin_model)

    def reply(self, query, context=None):
        # acquire reply content
--- a/bot/chatgpt/chat_gpt_session.py
+++ b/bot/chatgpt/chat_gpt_session.py
@@ -1,5 +1,6 @@
 from bot.session_manager import Session
 from common.log import logger
+from common import const

 """
    e.g.  [
@@ -61,10 +62,10 @@ def num_tokens_from_messages(messages, model):

    import tiktoken

-    if model in ["gpt-3.5-turbo-0301", "gpt-35-turbo"]:
+    if model in ["gpt-3.5-turbo-0301", "gpt-35-turbo", "gpt-3.5-turbo-1106"]:
        return num_tokens_from_messages(messages, model="gpt-3.5-turbo")
    elif model in ["gpt-4-0314", "gpt-4-0613", "gpt-4-32k", "gpt-4-32k-0613", "gpt-3.5-turbo-0613",
-                   "gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613", "gpt-35-turbo-16k"]:
+                   "gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613", "gpt-35-turbo-16k", const.GPT4_TURBO_PREVIEW, const.GPT4_VISION_PREVIEW]:
        return num_tokens_from_messages(messages, model="gpt-4")

    try:
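
The hunk above folds newly released model names into an existing token-counting family instead of giving each one its own branch. A minimal sketch of that aliasing idea, kept separate from `tiktoken`: the names `TOKEN_COUNT_FAMILY` and `token_count_family` are hypothetical and do not exist in the repository, and the two constant values are copied from `common/const.py` in this change.

```python
# Hypothetical helper illustrating the alias lists in num_tokens_from_messages.
GPT4_TURBO_PREVIEW = "gpt-4-1106-preview"     # value taken from common/const.py
GPT4_VISION_PREVIEW = "gpt-4-vision-preview"  # value taken from common/const.py

# Each family lists the model names that are counted with that family's rules.
TOKEN_COUNT_FAMILY = {
    "gpt-3.5-turbo": ["gpt-3.5-turbo-0301", "gpt-35-turbo", "gpt-3.5-turbo-1106"],
    "gpt-4": [
        "gpt-4-0314", "gpt-4-0613", "gpt-4-32k", "gpt-4-32k-0613",
        "gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613",
        "gpt-35-turbo-16k", GPT4_TURBO_PREVIEW, GPT4_VISION_PREVIEW,
    ],
}


def token_count_family(model: str) -> str:
    """Return the family whose counting rules apply; unknown names pass through."""
    for family, aliases in TOKEN_COUNT_FAMILY.items():
        if model in aliases:
            return family
    return model


print(token_count_family("gpt-3.5-turbo-1106"))
print(token_count_family(GPT4_TURBO_PREVIEW))
```

A lookup table like this keeps the list of aliases in one place, so supporting another model name would be a one-line change.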
--- a/bot/linkai/link_ai_bot.py
+++ b/bot/linkai/link_ai_bot.py
@@ -7,15 +7,14 @@ import requests

 from bot.bot import Bot
 from bot.chatgpt.chat_gpt_session import ChatGPTSession
-from bot.openai.open_ai_image import OpenAIImage
 from bot.session_manager import SessionManager
 from bridge.context import Context, ContextType
 from bridge.reply import Reply, ReplyType
 from common.log import logger
 from config import conf, pconf
+import threading

-
-class LinkAIBot(Bot, OpenAIImage):
+class LinkAIBot(Bot):
    # authentication failed
    AUTH_FAILED_CODE = 401
    NO_QUOTA_CODE = 406
@@ -47,10 +46,10 @@ class LinkAIBot(Bot, OpenAIImage):
        :param retry_count: 当前递归重试次数
        :return: 回复
        """
-        if retry_count >= 2:
+        if retry_count > 2:
            # exit from retry 2 times
            logger.warn("[LINKAI] failed after maximum number of retry times")
-            return Reply(ReplyType.ERROR, "请再问我一次吧")
+            return Reply(ReplyType.TEXT, "请再问我一次吧")

        try:
            # load config
@@ -64,7 +63,7 @@ class LinkAIBot(Bot, OpenAIImage):
            session_id = context["session_id"]

            session = self.sessions.session_query(query, session_id)
-            model = conf().get("model") or "gpt-3.5-turbo"
+            model = conf().get("model")
            # remove system message
            if session.messages[0].get("role") == "system":
                if app_code or model == "wenxin":
@@ -96,9 +95,18 @@ class LinkAIBot(Bot, OpenAIImage):
                total_tokens = response["usage"]["total_tokens"]
                logger.info(f"[LINKAI] reply={reply_content}, total_tokens={total_tokens}")
                self.sessions.session_reply(reply_content, session_id, total_tokens)
-                suffix = self._fecth_knowledge_search_suffix(response)
-                if suffix:
-                    reply_content += suffix
+    
+                agent_suffix = self._fetch_agent_suffix(response)
+                if agent_suffix:
+                    reply_content += agent_suffix
+                if not agent_suffix:
+                    knowledge_suffix = self._fetch_knowledge_search_suffix(response)
+                    if knowledge_suffix:
+                        reply_content += knowledge_suffix
+                # image process
+                if response["choices"][0].get("img_urls"):
+                    thread = threading.Thread(target=self._send_image, args=(context.get("channel"), context, response["choices"][0].get("img_urls")))
+                    thread.start()
                return Reply(ReplyType.TEXT, reply_content)

            else:
@@ -113,7 +121,7 @@ class LinkAIBot(Bot, OpenAIImage):
                    logger.warn(f"[LINKAI] do retry, times={retry_count}")
                    return self._chat(query, context, retry_count + 1)

-                return Reply(ReplyType.ERROR, "提问太快啦，请休息一下再问我吧")
+                return Reply(ReplyType.TEXT, "提问太快啦，请休息一下再问我吧")

        except Exception as e:
            logger.exception(e)
@@ -188,14 +196,40 @@ class LinkAIBot(Bot, OpenAIImage):
            return self.reply_text(session, app_code, retry_count + 1)


-    def _fecth_knowledge_search_suffix(self, response) -> str:
+    def create_img(self, query, retry_count=0, api_key=None):
+        try:
+            logger.info("[LinkImage] image_query={}".format(query))
+            headers = {
+                "Content-Type": "application/json",
+                "Authorization": f"Bearer {conf().get('linkai_api_key')}"
+            }
+            data = {
+                "prompt": query,
+                "n": 1,
+                "model": conf().get("text_to_image") or "dall-e-2",
+                "response_format": "url",
+                "img_proxy": conf().get("image_proxy")
+            }
+            url = conf().get("linkai_api_base", "https://api.link-ai.chat") + "/v1/images/generations"
+            res = requests.post(url, headers=headers, json=data, timeout=(5, 90))
+            t2 = time.time()
+            image_url = res.json()["data"][0]["url"]
+            logger.info("[OPEN_AI] image_url={}".format(image_url))
+            return True, image_url
+
+        except Exception as e:
+            logger.error(format(e))
+            return False, "画图出现问题，请休息一下再问我吧"
+
+
+    def _fetch_knowledge_search_suffix(self, response) -> str:
        try:
            if response.get("knowledge_base"):
                search_hit = response.get("knowledge_base").get("search_hit")
                first_similarity = response.get("knowledge_base").get("first_similarity")
                logger.info(f"[LINKAI] knowledge base, search_hit={search_hit}, first_similarity={first_similarity}")
                plugin_config = pconf("linkai")
-                if plugin_config.get("knowledge_base") and plugin_config.get("knowledge_base").get("search_miss_text_enabled"):
+                if plugin_config and plugin_config.get("knowledge_base") and plugin_config.get("knowledge_base").get("search_miss_text_enabled"):
                    search_miss_similarity = plugin_config.get("knowledge_base").get("search_miss_similarity")
                    search_miss_text = plugin_config.get("knowledge_base").get("search_miss_suffix")
                    if not search_hit:
@@ -204,3 +238,41 @@ class LinkAIBot(Bot, OpenAIImage):
                        return search_miss_text
        except Exception as e:
            logger.exception(e)
+
+    def _fetch_agent_suffix(self, response):
+        try:
+            plugin_list = []
+            logger.debug(f"[LinkAgent] res={response}")
+            if response.get("agent") and response.get("agent").get("chain") and response.get("agent").get("need_show_plugin"):
+                chain = response.get("agent").get("chain")
+                suffix = "\n\n- - - - - - - - - - - -"
+                i = 0
+                for turn in chain:
+                    plugin_name = turn.get('plugin_name')
+                    suffix += "\n"
+                    need_show_thought = response.get("agent").get("need_show_thought")
+                    if turn.get("thought") and plugin_name and need_show_thought:
+                        suffix += f"{turn.get('thought')}\n"
+                    if plugin_name:
+                        plugin_list.append(turn.get('plugin_name'))
+                        suffix += f"{turn.get('plugin_icon')} {turn.get('plugin_name')}"
+                        if turn.get('plugin_input'):
+                            suffix += f"：{turn.get('plugin_input')}"
+                    if i < len(chain) - 1:
+                        suffix += "\n"
+                    i += 1
+                logger.info(f"[LinkAgent] use plugins: {plugin_list}")
+                return suffix
+        except Exception as e:
+            logger.exception(e)
+
+
+    def _send_image(self, channel, context, image_urls):
+        if not image_urls:
+            return
+        try:
+            for url in image_urls:
+                reply = Reply(ReplyType.IMAGE_URL, url)
+                channel.send(reply, context)
+        except Exception as e:
+            logger.error(e)
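
`_fetch_agent_suffix` above appends a short trace of the plugins an agent used to the reply text. The following is a simplified standalone sketch of that formatting, not the method itself: `build_agent_suffix` and the sample payload are hypothetical, the response field names are the ones read in the diff, and the newline layout is condensed to one line per plugin call with an ASCII colon.

```python
SEPARATOR = "- - - - - - - - - - - -"


def build_agent_suffix(response: dict) -> str:
    """Format the plugin chain of a response as a suffix, or return "" if none."""
    agent = response.get("agent") or {}
    chain = agent.get("chain") or []
    if not chain or not agent.get("need_show_plugin"):
        return ""
    parts = []
    for turn in chain:
        name = turn.get("plugin_name")
        if not name:
            continue  # turns without a plugin add nothing to the trace
        if turn.get("thought") and agent.get("need_show_thought"):
            parts.append(turn["thought"])
        line = f"{turn.get('plugin_icon', '')} {name}".strip()
        if turn.get("plugin_input"):
            line += f": {turn['plugin_input']}"
        parts.append(line)
    if not parts:
        return ""
    return "\n\n" + SEPARATOR + "\n" + "\n".join(parts)


# Hypothetical payload using the field names read in the diff.
sample = {
    "agent": {
        "need_show_plugin": True,
        "need_show_thought": False,
        "chain": [
            {"plugin_name": "search", "plugin_icon": "#",
             "plugin_input": "weather", "thought": "look it up"},
            {"plugin_name": "calc", "plugin_icon": "#"},
        ],
    }
}
print(build_agent_suffix(sample))
```

Returning an empty string when there is nothing to show matches how the caller in the diff falls back to the knowledge-base suffix when no agent suffix is produced.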
--- a/bot/openai/open_ai_image.py
+++ b/bot/openai/open_ai_image.py
@@ -24,7 +24,8 @@ class OpenAIImage(object):
                api_key=api_key,
                prompt=query,  # 图片描述
                n=1,  # 每次生成图片的数量
-                size=conf().get("image_create_size", "256x256"),  # 图片大小,可选有 256x256, 512x512, 1024x1024
+                model=conf().get("text_to_image") or "dall-e-2",
+                # size=conf().get("image_create_size", "256x256"),  # 图片大小,可选有 256x256, 512x512, 1024x1024
            )
            image_url = response["data"][0]["url"]
            logger.info("[OPEN_AI] image_url={}".format(image_url))
@@ -36,7 +37,7 @@ class OpenAIImage(object):
                logger.warn("[OPEN_AI] ImgCreate RateLimit exceed, 第{}次重试".format(retry_count + 1))
                return self.create_img(query, retry_count + 1)
            else:
-                return False, "提问太快啦，请休息一下再问我吧"
+                return False, "画图出现问题，请休息一下再问我吧"
        except Exception as e:
            logger.exception(e)
-            return False, str(e)
+            return False, "画图出现问题，请休息一下再问我吧"
--- a/bot/xunfei/xunfei_spark_bot.py
+++ b/bot/xunfei/xunfei_spark_bot.py
@@ -40,10 +40,11 @@ class XunFeiBot(Bot):
        self.app_id = conf().get("xunfei_app_id")
        self.api_key = conf().get("xunfei_api_key")
        self.api_secret = conf().get("xunfei_api_secret")
-        # 默认使用v2.0版本，1.5版本可设置为 general
+        # 默认使用v3.0版本，2.0版本可设置为generalv2，  1.5版本可设置为 general
        self.domain = "generalv2"
-        # 默认使用v2.0版本，1.5版本可设置为 "ws://spark-api.xf-yun.com/v1.1/chat"
-        self.spark_url = "ws://spark-api.xf-yun.com/v2.1/chat"
+        # 默认使用v3.0版本，1.5版本可设置为 "ws://spark-api.xf-yun.com/v1.1/chat"，
+        # 2.0版本可设置为 "ws://spark-api.xf-yun.com/v2.1/chat"
+        self.spark_url = "ws://spark-api.xf-yun.com/v3.1/chat"
        self.host = urlparse(self.spark_url).netloc
        self.path = urlparse(self.spark_url).path
        # 和wenxin使用相同的session机制
@@ -56,7 +57,8 @@ class XunFeiBot(Bot):
            request_id = self.gen_request_id(session_id)
            reply_map[request_id] = ""
            session = self.sessions.session_query(query, session_id)
-            threading.Thread(target=self.create_web_socket, args=(session.messages, request_id)).start()
+            threading.Thread(target=self.create_web_socket,
+                             args=(session.messages, request_id)).start()
            depth = 0
            time.sleep(0.1)
            t1 = time.time()
@@ -83,20 +85,27 @@ class XunFeiBot(Bot):
                    depth += 1
                    continue
            t2 = time.time()
-            logger.info(f"[XunFei-API] response={reply_map[request_id]}, time={t2 - t1}s, usage={usage}")
-            self.sessions.session_reply(reply_map[request_id], session_id, usage.get("total_tokens"))
+            logger.info(
+                f"[XunFei-API] response={reply_map[request_id]}, time={t2 - t1}s, usage={usage}"
+            )
+            self.sessions.session_reply(reply_map[request_id], session_id,
+                                        usage.get("total_tokens"))
            reply = Reply(ReplyType.TEXT, reply_map[request_id])
            del reply_map[request_id]
            return reply
        else:
-            reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
+            reply = Reply(ReplyType.ERROR,
+                          "Bot不支持处理{}类型的消息".format(context.type))
            return reply

    def create_web_socket(self, prompt, session_id, temperature=0.5):
        logger.info(f"[XunFei] start connect, prompt={prompt}")
        websocket.enableTrace(False)
        wsUrl = self.create_url()
-        ws = websocket.WebSocketApp(wsUrl, on_message=on_message, on_error=on_error, on_close=on_close,
+        ws = websocket.WebSocketApp(wsUrl,
+                                    on_message=on_message,
+                                    on_error=on_error,
+                                    on_close=on_close,
                                    on_open=on_open)
        data_queue = queue.Queue(1000)
        queue_map[session_id] = data_queue
@@ -108,7 +117,8 @@ class XunFeiBot(Bot):
        ws.run_forever(sslopt={"cert_reqs": ssl.CERT_NONE})

    def gen_request_id(self, session_id: str):
-        return session_id + "_" + str(int(time.time())) + "" + str(random.randint(0, 100))
+        return session_id + "_" + str(int(time.time())) + "" + str(
+            random.randint(0, 100))

    # 生成url
    def create_url(self):
@@ -122,22 +132,21 @@ class XunFeiBot(Bot):
        signature_origin += "GET " + self.path + " HTTP/1.1"

        # 进行hmac-sha256进行加密
-        signature_sha = hmac.new(self.api_secret.encode('utf-8'), signature_origin.encode('utf-8'),
+        signature_sha = hmac.new(self.api_secret.encode('utf-8'),
+                                 signature_origin.encode('utf-8'),
                                 digestmod=hashlib.sha256).digest()

-        signature_sha_base64 = base64.b64encode(signature_sha).decode(encoding='utf-8')
+        signature_sha_base64 = base64.b64encode(signature_sha).decode(
+            encoding='utf-8')

        authorization_origin = f'api_key="{self.api_key}", algorithm="hmac-sha256", headers="host date request-line", ' \
                               f'signature="{signature_sha_base64}"'

-        authorization = base64.b64encode(authorization_origin.encode('utf-8')).decode(encoding='utf-8')
+        authorization = base64.b64encode(
+            authorization_origin.encode('utf-8')).decode(encoding='utf-8')

        # 将请求的鉴权参数组合为字典
-        v = {
-            "authorization": authorization,
-            "date": date,
-            "host": self.host
-        }
+        v = {"authorization": authorization, "date": date, "host": self.host}
        # 拼接鉴权参数，生成url
        url = self.spark_url + '?' + urlencode(v)
        # 此处打印出建立连接时候的url,参考本demo的时候可取消上方打印的注释，比对相同参数时生成的url与自己代码生成的url是否一致
@@ -190,11 +199,15 @@ def on_close(ws, one, two):
 # 收到websocket连接建立的处理
 def on_open(ws):
    logger.info(f"[XunFei] Start websocket, session_id={ws.session_id}")
-    thread.start_new_thread(run, (ws,))
+    thread.start_new_thread(run, (ws, ))


 def run(ws, *args):
-    data = json.dumps(gen_params(appid=ws.appid, domain=ws.domain, question=ws.question, temperature=ws.temperature))
+    data = json.dumps(
+        gen_params(appid=ws.appid,
+                   domain=ws.domain,
+                   question=ws.question,
+                   temperature=ws.temperature))
    ws.send(data)


@@ -212,7 +225,8 @@ def on_message(ws, message):
        content = choices["text"][0]["content"]
        data_queue = queue_map.get(ws.session_id)
        if not data_queue:
-            logger.error(f"[XunFei] can't find data queue, session_id={ws.session_id}")
+            logger.error(
+                f"[XunFei] can't find data queue, session_id={ws.session_id}")
            return
        reply_item = ReplyItem(content)
        if status == 2:
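
`create_url` above signs the WebSocket URL with HMAC-SHA256 and passes the result as query parameters. A minimal sketch of that flow as a pure function follows. Two points are assumptions because they fall outside the hunk: the string to sign is taken to be the `host` line, the `date` line, and the request line (as the `headers="host date request-line"` field suggests), and the date is passed in as a ready-made RFC 1123 string. `build_spark_url` and the credentials are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import urlencode, urlparse


def build_spark_url(spark_url: str, api_key: str, api_secret: str, date: str) -> str:
    """Return spark_url with authorization, date and host query parameters."""
    parsed = urlparse(spark_url)
    host, path = parsed.netloc, parsed.path

    # Assumed layout of the signed string: host, date, then the request line.
    signature_origin = f"host: {host}\ndate: {date}\nGET {path} HTTP/1.1"

    # HMAC-SHA256 over the signed string, keyed with the API secret.
    digest = hmac.new(api_secret.encode("utf-8"),
                      signature_origin.encode("utf-8"),
                      digestmod=hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("utf-8")

    authorization_origin = (
        f'api_key="{api_key}", algorithm="hmac-sha256", '
        f'headers="host date request-line", signature="{signature}"'
    )
    authorization = base64.b64encode(authorization_origin.encode("utf-8")).decode("utf-8")

    # urlencode percent-escapes the base64 padding and the spaces in the date.
    query = urlencode({"authorization": authorization, "date": date, "host": host})
    return spark_url + "?" + query


url = build_spark_url("ws://spark-api.xf-yun.com/v3.1/chat",
                      "demo-key", "demo-secret",
                      "Fri, 10 Nov 2023 09:16:15 GMT")
print(url)
```

Taking the date as a parameter makes the function deterministic, which helps when comparing a generated URL against one produced by other code with the same inputs. The `domain` value sent in the request payload is a separate setting and is not covered by this sketch.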
--- a/bridge/bridge.py
+++ b/bridge/bridge.py
@@ -18,17 +18,21 @@ class Bridge(object):
            "text_to_voice": conf().get("text_to_voice", "google"),
            "translate": conf().get("translate", "baidu"),
        }
-        model_type = conf().get("model")
+        model_type = conf().get("model") or const.GPT35
        if model_type in ["text-davinci-003"]:
            self.btype["chat"] = const.OPEN_AI
        if conf().get("use_azure_chatgpt", False):
            self.btype["chat"] = const.CHATGPTONAZURE
-        if model_type in ["wenxin"]:
+        if model_type in ["wenxin", "wenxin-4"]:
            self.btype["chat"] = const.BAIDU
        if model_type in ["xunfei"]:
            self.btype["chat"] = const.XUNFEI
        if conf().get("use_linkai") and conf().get("linkai_api_key"):
            self.btype["chat"] = const.LINKAI
+            if not conf().get("voice_to_text") or conf().get("voice_to_text") in ["openai"]:
+                self.btype["voice_to_text"] = const.LINKAI
+            if not conf().get("text_to_voice") or conf().get("text_to_voice") in ["openai", const.TTS_1, const.TTS_1_HD]:
+                self.btype["text_to_voice"] = const.LINKAI
        if model_type in ["claude"]:
            self.btype["chat"] = const.CLAUDEAI
        self.bots = {}
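
The `Bridge` hunk above resolves which backend serves chat, speech-to-text, and text-to-speech, with later checks overriding earlier ones. A minimal sketch of that precedence as a pure function over a plain dict: `resolve_bot_types` is a hypothetical name, the `"chatGPT"` default and the `"openai"` default for `voice_to_text` are assumptions (they lie outside the hunk), and the remaining constant values are the ones shown in `common/const.py`.

```python
GPT35 = "gpt-3.5-turbo"
LINKAI = "linkai"
TTS_1, TTS_1_HD = "tts-1", "tts-1-hd"


def resolve_bot_types(cfg: dict) -> dict:
    """Mirror the order of checks in Bridge.__init__ for a plain config dict."""
    model = cfg.get("model") or GPT35  # an empty "model" falls back to GPT-3.5
    btype = {
        "chat": "chatGPT",                                    # assumed default
        "voice_to_text": cfg.get("voice_to_text", "openai"),  # assumed default
        "text_to_voice": cfg.get("text_to_voice", "google"),
    }
    if cfg.get("use_azure_chatgpt", False):
        btype["chat"] = "chatGPTOnAzure"
    if model in ["wenxin", "wenxin-4"]:
        btype["chat"] = "baidu"
    if model in ["xunfei"]:
        btype["chat"] = "xunfei"
    if cfg.get("use_linkai") and cfg.get("linkai_api_key"):
        btype["chat"] = LINKAI
        # LinkAI also takes over speech when the engine is unset or OpenAI-based.
        if not cfg.get("voice_to_text") or cfg.get("voice_to_text") in ["openai"]:
            btype["voice_to_text"] = LINKAI
        if not cfg.get("text_to_voice") or cfg.get("text_to_voice") in ["openai", TTS_1, TTS_1_HD]:
            btype["text_to_voice"] = LINKAI
    if model in ["claude"]:
        btype["chat"] = "claude"
    return btype


print(resolve_bot_types({"model": "", "use_linkai": True, "linkai_api_key": "k"}))
print(resolve_bot_types({"model": "wenxin-4"}))
```

Because the checks run in sequence, a `claude` model still wins over LinkAI for chat, while an explicitly configured non-OpenAI speech engine such as `baidu` is left untouched.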
--- a/channel/chat_channel.py
+++ b/channel/chat_channel.py
@@ -91,6 +91,7 @@ class ChatChannel(Channel):
        # 消息内容匹配过程，并处理content
        if ctype == ContextType.TEXT:
            if first_in and "」\n- - - - - - -" in content:  # 初次匹配 过滤引用消息
+                logger.debug(content)
                logger.debug("[WX]reference query skipped")
                return None

@@ -174,6 +175,7 @@ class ChatChannel(Channel):
            if e_context.is_break():
                context["generate_breaked_by"] = e_context["breaked_by"]
            if context.type == ContextType.TEXT or context.type == ContextType.IMAGE_CREATE:  # 文字和图片消息
+                context["channel"] = e_context["channel"]
                reply = super().build_reply_content(context.content, context)
            elif context.type == ContextType.VOICE:  # 语音消息
                cmsg = context["msg"]
--- a/channel/wechat/wechat_channel.py
+++ b/channel/wechat/wechat_channel.py
@@ -142,6 +142,9 @@ class WechatChannel(ChatChannel):
    @time_checker
    @_check
    def handle_single(self, cmsg: ChatMessage):
+        # filter system message
+        if cmsg.other_user_id in ["weixin"]:
+            return
        if cmsg.ctype == ContextType.VOICE:
            if conf().get("speech_recognition") != True:
                return
--- a/common/const.py
+++ b/common/const.py
@@ -5,8 +5,15 @@ BAIDU = "baidu"
 XUNFEI = "xunfei"
 CHATGPTONAZURE = "chatGPTOnAzure"
 LINKAI = "linkai"
-
-VERSION = "1.3.0"
-
 CLAUDEAI = "claude"
-MODEL_LIST = ["gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4", "wenxin", "xunfei","claude"]
+
+# model
+GPT35 = "gpt-3.5-turbo"
+GPT4 = "gpt-4"
+GPT4_TURBO_PREVIEW = "gpt-4-1106-preview"
+GPT4_VISION_PREVIEW = "gpt-4-vision-preview"
+WHISPER_1 = "whisper-1"
+TTS_1 = "tts-1"
+TTS_1_HD = "tts-1-hd"
+
+MODEL_LIST = ["gpt-3.5-turbo", "gpt-3.5-turbo-16k", "gpt-4", "wenxin", "wenxin-4", "xunfei", "claude", "gpt-4-turbo", GPT4_TURBO_PREVIEW]
--- a/config-template.json
+++ b/config-template.json
@@ -1,7 +1,10 @@
 {
-  "open_ai_api_key": "YOUR API KEY",
-  "model": "gpt-3.5-turbo",
  "channel_type": "wx",
+  "model": "",
+  "open_ai_api_key": "YOUR API KEY",
+  "text_to_image": "dall-e-2",
+  "voice_to_text": "openai",
+  "text_to_voice": "openai",
  "proxy": "",
  "hot_reload": false,
  "single_chat_prefix": [
@@ -22,10 +25,10 @@
  "image_create_prefix": [
    "画"
  ],
-  "speech_recognition": false,
+  "speech_recognition": true,
  "group_speech_recognition": false,
  "voice_reply_voice": false,
-  "conversation_max_tokens": 1000,
+  "conversation_max_tokens": 2500,
  "expires_in_seconds": 3600,
  "character_desc": "你是ChatGPT, 一个由OpenAI训练的大型语言模型, 你旨在回答并解决人们的任何问题，并且可以使用多种语言与人交流。",
  "temperature": 0.7,
--- a/config.py
+++ b/config.py
@@ -16,7 +16,7 @@ available_setting = {
    "open_ai_api_base": "https://api.openai.com/v1",
    "proxy": "",  # openai使用的代理
    # chatgpt模型， 当use_azure_chatgpt为true时，其名称为Azure上model deployment名称
-    "model": "gpt-3.5-turbo",  # 还支持 gpt-3.5-turbo-16k, gpt-4, wenxin, xunfei
+    "model": "gpt-3.5-turbo",  # 还支持 gpt-4, gpt-4-turbo, wenxin, xunfei
    "use_azure_chatgpt": False,  # 是否使用azure的chatgpt
    "azure_deployment_id": "",  # azure 模型部署名称
    "azure_api_version": "",  # azure api版本
@@ -32,10 +32,13 @@ available_setting = {
    "group_name_white_list": ["ChatGPT测试群", "ChatGPT测试群2"],  # 开启自动回复的群名称列表
    "group_name_keyword_white_list": [],  # 开启自动回复的群名称关键词列表
    "group_chat_in_one_session": ["ChatGPT测试群"],  # 支持会话上下文共享的群名称
+    "group_welcome_msg": "",  # 配置新人进群固定欢迎语，不配置则使用随机风格欢迎 
    "trigger_by_self": False,  # 是否允许机器人触发
+    "text_to_image": "dall-e-2",  # 图片生成模型，可选 dall-e-2, dall-e-3
+    "image_proxy": True,  # 是否需要图片代理，国内访问LinkAI时需要
    "image_create_prefix": ["画", "看", "找"],  # 开启图片回复的前缀
    "concurrency_in_session": 1,  # 同一会话最多有多少条消息在处理中，大于1可能乱序
-    "image_create_size": "256x256",  # 图片大小,可选有 256x256, 512x512, 1024x1024
+    "image_create_size": "256x256",  # 图片大小,可选有 256x256, 512x512, 1024x1024 (dall-e-3默认为1024x1024)
    # chatgpt会话参数
    "expires_in_seconds": 3600,  # 无操作会话的过期时间
    # 人格描述
@@ -49,7 +52,7 @@ available_setting = {
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
-    "request_timeout": 60,  # chatgpt请求超时时间，openai接口默认设置为600，对于难问题一般需要较长时间
+    "request_timeout": 180,  # chatgpt请求超时时间，openai接口默认设置为600，对于难问题一般需要较长时间
    "timeout": 120,  # chatgpt重试超时时间，在这个时间内，将会自动重试
    # Baidu 文心一言参数
    "baidu_wenxin_model": "eb-instant",  # 默认使用ERNIE-Bot-turbo模型
@@ -65,12 +68,14 @@ available_setting = {
    # wework的通用配置
    "wework_smart": True,  # 配置wework是否使用已登录的企业微信，False为多开
    # 语音设置
-    "speech_recognition": False,  # 是否开启语音识别
+    "speech_recognition": True,  # 是否开启语音识别
    "group_speech_recognition": False,  # 是否开启群组语音识别
    "voice_reply_voice": False,  # 是否使用语音回复语音，需要设置对应语音合成引擎的api key
    "always_reply_voice": False,  # 是否一直使用语音回复
    "voice_to_text": "openai",  # 语音识别引擎，支持openai,baidu,google,azure
-    "text_to_voice": "baidu",  # 语音合成引擎，支持baidu,google,pytts(offline),azure,elevenlabs
+    "text_to_voice": "openai",  # 语音合成引擎，支持openai,baidu,google,pytts(offline),azure,elevenlabs
+    "text_to_voice_model": "tts-1",
+    "tts_voice_id": "alloy",
    # baidu 语音api配置， 使用百度语音识别和语音合成时需要
    "baidu_app_id": "",
    "baidu_api_key": "",
--- a/plugins/godcmd/godcmd.py
+++ b/plugins/godcmd/godcmd.py
@@ -136,9 +136,9 @@ ADMIN_COMMANDS = {

 # 定义帮助函数
 def get_help_text(isadmin, isgroup):
-    help_text = "通用指令：\n"
+    help_text = "通用指令\n"
    for cmd, info in COMMANDS.items():
-        if cmd == "auth":  # 不提示认证指令
+        if cmd in ["auth", "set_openai_api_key", "reset_openai_api_key", "set_gpt_model", "reset_gpt_model", "gpt_model"]:  # 不显示帮助指令
            continue
        if cmd == "id" and conf().get("channel_type", "wx") not in ["wxy", "wechatmp"]:
            continue
@@ -151,7 +151,7 @@ def get_help_text(isadmin, isgroup):

    # 插件指令
    plugins = PluginManager().list_plugins()
-    help_text += "\n目前可用插件有："
+    help_text += "\n可用插件"
    for plugin in plugins:
        if plugins[plugin].enabled and not plugins[plugin].hidden:
            namecn = plugins[plugin].namecn
@@ -266,14 +266,16 @@ class Godcmd(Plugin):
                    if not isadmin and not self.is_admin_in_group(e_context["context"]):
                        ok, result = False, "需要管理员权限执行"
                    elif len(args) == 0:
-                        ok, result = True, "当前模型为: " + str(conf().get("model"))
+                        model = conf().get("model") or const.GPT35
+                        ok, result = True, "当前模型为: " + str(model)
                    elif len(args) == 1:
                        if args[0] not in const.MODEL_LIST:
                            ok, result = False, "模型名称不存在"
                        else:
-                            conf()["model"] = args[0]
+                            conf()["model"] = self.model_mapping(args[0])
                            Bridge().reset_bot()
-                            ok, result = True, "模型设置为: " + str(conf().get("model"))
+                            model = conf().get("model") or const.GPT35
+                            ok, result = True, "模型设置为: " + str(model)
                elif cmd == "id":
                    ok, result = True, user
                elif cmd == "set_openai_api_key":
@@ -467,3 +469,9 @@ class Godcmd(Plugin):
        if context["isgroup"]:
            return context.kwargs.get("msg").actual_user_id in global_config["admin_users"]
        return False
+
+
+    def model_mapping(self, model) -> str:
+        if model == "gpt-4-turbo":
+            return const.GPT4_TURBO_PREVIEW
+        return model
--- a/plugins/hello/hello.py
+++ b/plugins/hello/hello.py
@@ -6,6 +6,7 @@ from bridge.reply import Reply, ReplyType
 from channel.chat_message import ChatMessage
 from common.log import logger
 from plugins import *
+from config import conf


@plugins.register(

@@ -31,6 +32,13 @@ class Hello(Plugin):
            return

        if e_context["context"].type == ContextType.JOIN_GROUP:
+            if "group_welcome_msg" in conf():
+                reply = Reply()
+                reply.type = ReplyType.TEXT
+                reply.content = conf().get("group_welcome_msg", "")
+                e_context["reply"] = reply
+                e_context.action = EventAction.BREAK_PASS  # 事件结束，并跳过处理context的默认逻辑
+                return
            e_context["context"].type = ContextType.TEXT
            msg: ChatMessage = e_context["context"]["msg"]
            e_context["context"].content = f'请你随机使用一种风格说一句问候语来欢迎新用户"{msg.actual_user_nickname}"加入群聊。'
--- a/plugins/linkai/README.md
+++ b/plugins/linkai/README.md
@@ -1,6 +1,6 @@
 ## 插件说明

-基于 LinkAI 提供的知识库、Midjourney绘画、文档对话等能力对机器人的功能进行增强。平台地址: https://chat.link-ai.tech/console
+基于 LinkAI 提供的知识库、Midjourney绘画、文档对话等能力对机器人的功能进行增强。平台地址: https://link-ai.tech/console

 ## 插件配置

@@ -25,12 +25,13 @@
    "summary": {
        "enabled": true,              # 文档总结和对话功能开关
        "group_enabled": true,        # 是否支持群聊开启
-        "max_file_size": 5000        # 文件的大小限制，单位KB，默认为5M，超过该大小直接忽略
+        "max_file_size": 5000,        # 文件的大小限制，单位KB，默认为5M，超过该大小直接忽略
+        "type": ["FILE", "SHARING", "IMAGE"]  # 支持总结的类型，分别表示 文件、分享链接、图片
    }
 }
 ```

-根目录 `config.json` 中配置，`API_KEY` 在 [控制台](https://chat.link-ai.tech/console/interface) 中创建并复制过来:
+根目录 `config.json` 中配置，`API_KEY` 在 [控制台](https://link-ai.tech/console/interface) 中创建并复制过来:

 ```bash
 "linkai_api_key": "Link_xxxxxxxxx"
@@ -99,7 +100,7 @@

 #### 使用

-功能开启后，向机器人发送 **文件** 或 **分享链接卡片** 即可生成摘要，进一步可以与文件或链接的内容进行多轮对话。
+功能开启后，向机器人发送 **文件**、 **分享链接卡片**、**图片** 即可生成摘要，进一步可以与文件或链接的内容进行多轮对话。如果需要关闭某种类型的内容总结，设置 `summary`配置中的type字段即可。

 #### 限制

--- a/plugins/linkai/config.json.template
+++ b/plugins/linkai/config.json.template
@@ -14,6 +14,7 @@
    "summary": {
        "enabled": true,
        "group_enabled": true,
-        "max_file_size": 5000
+        "max_file_size": 5000,
+        "type": ["FILE", "SHARING", "IMAGE"]
    }
 }
--- a/plugins/linkai/linkai.py
+++ b/plugins/linkai/linkai.py
@@ -46,19 +46,24 @@ class LinkAI(Plugin):
            # filter content no need solve
            return

-        if context.type == ContextType.FILE and self._is_summary_open(context):
+        if context.type in [ContextType.FILE, ContextType.IMAGE] and self._is_summary_open(context):
            # 文件处理
            context.get("msg").prepare()
            file_path = context.content
            if not LinkSummary().check_file(file_path, self.sum_config):
                return
-            _send_info(e_context, "正在为你加速生成摘要，请稍后")
+            if context.type != ContextType.IMAGE:
+                _send_info(e_context, "正在为你加速生成摘要，请稍后")
            res = LinkSummary().summary_file(file_path)
            if not res:
-                _set_reply_text("因为神秘力量无法获取文章内容，请稍后再试吧", e_context, level=ReplyType.TEXT)
+                if context.type != ContextType.IMAGE:
+                    _set_reply_text("因为神秘力量无法获取内容，请稍后再试吧", e_context, level=ReplyType.TEXT)
                return
-            USER_FILE_MAP[_find_user_id(context) + "-sum_id"] = res.get("summary_id")
-            _set_reply_text(res.get("summary") + "\n\n💬 发送 \"开启对话\" 可以开启与文件内容的对话", e_context, level=ReplyType.TEXT)
+            summary_text = res.get("summary")
+            if context.type != ContextType.IMAGE:
+                USER_FILE_MAP[_find_user_id(context) + "-sum_id"] = res.get("summary_id")
+                summary_text += "\n\n💬 发送 \"开启对话\" 可以开启与文件内容的对话"
+            _set_reply_text(summary_text, e_context, level=ReplyType.TEXT)
            os.remove(file_path)
            return

@@ -187,6 +192,11 @@ class LinkAI(Plugin):
            return False
        if context.kwargs.get("isgroup") and not self.sum_config.get("group_enabled"):
            return False
+        support_type = self.sum_config.get("type")
+        if not support_type:
+            return True
+        if context.type.name not in support_type:
+            return False
        return True

     # LinkAI conversation task handling
@@ -220,7 +230,7 @@ class LinkAI(Plugin):

    def get_help_text(self, verbose=False, **kwargs):
        trigger_prefix = _get_trigger_prefix()
-        help_text = "用于集成 LinkAI 提供的知识库、Midjourney绘画、文档总结对话等能力。\n\n"
+        help_text = "用于集成 LinkAI 提供的知识库、Midjourney绘画、文档总结、联网搜索等能力。\n\n"
        if not verbose:
            return help_text
        help_text += f'📖 知识库\n - 群聊中指定应用: {trigger_prefix}linkai app 应用编码\n'
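
Review note: the `type` gate added to `_is_summary_open` can be read in isolation as the sketch below. It assumes a stand-in `ContextType` enum and a plain dict for the `summary` config; `is_type_allowed` is a hypothetical name used only for illustration, not a function in the plugin.

```python
from enum import Enum


class ContextType(Enum):
    # Stand-in for the project's ContextType (assumption): only the members used here.
    FILE = 1
    SHARING = 2
    IMAGE = 3


def is_type_allowed(context_type: ContextType, sum_config: dict) -> bool:
    """Return True when summaries are permitted for this message type."""
    support_type = sum_config.get("type")
    # A missing or empty "type" list keeps the previous behaviour: every type is allowed.
    if not support_type:
        return True
    # The config stores enum member names as strings, e.g. "FILE", "SHARING", "IMAGE".
    return context_type.name in support_type
```

Under this reading, a config of `"type": ["FILE", "SHARING"]` turns off image summaries, while omitting the key (or leaving the list empty) leaves all three types enabled.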
--- a/plugins/linkai/summary.py
+++ b/plugins/linkai/summary.py
@@ -13,7 +13,8 @@ class LinkSummary:
            "file": open(file_path, "rb"),
            "name": file_path.split("/")[-1],
        }
-        res = requests.post(url=self.base_url() + "/v1/summary/file", headers=self.headers(), files=file_body, timeout=(5, 300))
+        url = self.base_url() + "/v1/summary/file"
+        res = requests.post(url, headers=self.headers(), files=file_body, timeout=(5, 300))
        return self._parse_summary_res(res)

    def summary_url(self, url: str):
@@ -71,7 +72,7 @@ class LinkSummary:
            return False

        suffix = file_path.split(".")[-1]
-        support_list = ["txt", "csv", "docx", "pdf", "md"]
+        support_list = ["txt", "csv", "docx", "pdf", "md", "jpg", "jpeg", "png"]
        if suffix not in support_list:
            logger.warn(f"[LinkSum] unsupported file, suffix={suffix}, support_list={support_list}")
            return False
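
Review note: the suffix check above compares `file_path.split(".")[-1]` against a lowercase list, so a file named `photo.JPG` would be rejected. One possible hardening, shown only as a sketch, normalises the extension first. `SUPPORTED_SUFFIXES` and `is_supported_suffix` are hypothetical names and are not part of `summary.py`.

```python
import os

# Same extensions as the updated support_list in the diff.
SUPPORTED_SUFFIXES = {"txt", "csv", "docx", "pdf", "md", "jpg", "jpeg", "png"}


def is_supported_suffix(file_path: str) -> bool:
    """Case-insensitive extension check for summary uploads."""
    # splitext returns (root, ".ext"); drop the dot and lowercase the rest.
    suffix = os.path.splitext(file_path)[1].lstrip(".").lower()
    return suffix in SUPPORTED_SUFFIXES
```

Whether uppercase extensions actually occur depends on how the channel names downloaded files, so this may be unnecessary in practice.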
--- a/plugins/source.json
+++ b/plugins/source.json
@@ -15,6 +15,10 @@
    "timetask": {
      "url": "https://github.com/haikerapples/timetask.git",
      "desc": "一款定时任务系统的插件"
+    },
+    "Apilot": {
+      "url": "https://github.com/6vision/Apilot.git",
+      "desc": "通过api直接查询早报、热榜、快递、天气等实用信息的插件"
    }
  }
 }
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,4 @@
-openai>=0.27.8
+openai==0.27.8
 HTMLParser>=0.0.2
 PyQRCode>=1.2.1
 qrcode>=7.4.2
--- a/voice/factory.py
+++ b/voice/factory.py
@@ -33,4 +33,8 @@ def create_voice(voice_type):
        from voice.elevent.elevent_voice import ElevenLabsVoice

        return ElevenLabsVoice()
+
+    elif voice_type == "linkai":
+        from voice.linkai.linkai_voice import LinkAIVoice
+        return LinkAIVoice()
    raise RuntimeError
--- a/voice/linkai/linkai_voice.py
+++ b/voice/linkai/linkai_voice.py
@@ -0,0 +1,82 @@
+"""
+LinkAI voice service
+"""
+import random
+import requests
+from voice import audio_convert
+from bridge.reply import Reply, ReplyType
+from common.log import logger
+from config import conf
+from voice.voice import Voice
+from common import const
+import os
+import datetime
+
+class LinkAIVoice(Voice):
+    def __init__(self):
+        pass
+
+    def voiceToText(self, voice_file):
+        logger.debug("[LinkVoice] voice file name={}".format(voice_file))
+        try:
+            url = conf().get("linkai_api_base", "https://api.link-ai.chat") + "/v1/audio/transcriptions"
+            headers = {"Authorization": "Bearer " + conf().get("linkai_api_key")}
+            model = None
+            if not conf().get("text_to_voice") or conf().get("voice_to_text") == "openai":
+                model = const.WHISPER_1
+            if voice_file.endswith(".amr"):
+                try:
+                    mp3_file = os.path.splitext(voice_file)[0] + ".mp3"
+                    audio_convert.any_to_mp3(voice_file, mp3_file)
+                    voice_file = mp3_file
+                except Exception as e:
+                    logger.warn(f"[LinkVoice] amr file transfer failed, directly send amr voice file: {format(e)}")
+            file = open(voice_file, "rb")
+            file_body = {
+                "file": file
+            }
+            data = {
+                "model": model
+            }
+            res = requests.post(url, files=file_body, headers=headers, data=data, timeout=(5, 60))
+            if res.status_code == 200:
+                text = res.json().get("text")
+            else:
+                res_json = res.json()
+                logger.error(f"[LinkVoice] voiceToText error, status_code={res.status_code}, msg={res_json.get('message')}")
+                return None
+            reply = Reply(ReplyType.TEXT, text)
+            logger.info(f"[LinkVoice] voiceToText success, text={text}, file name={voice_file}")
+        except Exception as e:
+            logger.error(e)
+            return None
+        return reply
+
+    def textToVoice(self, text):
+        try:
+            url = conf().get("linkai_api_base", "https://api.link-ai.chat") + "/v1/audio/speech"
+            headers = {"Authorization": "Bearer " + conf().get("linkai_api_key")}
+            model = const.TTS_1
+            if not conf().get("text_to_voice") or conf().get("text_to_voice") in ["openai", const.TTS_1, const.TTS_1_HD]:
+                model = conf().get("text_to_voice_model") or const.TTS_1
+            data = {
+                "model": model,
+                "input": text,
+                "voice": conf().get("tts_voice_id")
+            }
+            res = requests.post(url, headers=headers, json=data, timeout=(5, 120))
+            if res.status_code == 200:
+                tmp_file_name = "tmp/" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') + str(random.randint(0, 1000)) + ".mp3"
+                with open(tmp_file_name, 'wb') as f:
+                    f.write(res.content)
+                reply = Reply(ReplyType.VOICE, tmp_file_name)
+                logger.info(f"[LinkVoice] textToVoice success, input={text}, model={model}, voice_id={data.get('voice')}")
+                return reply
+            else:
+                res_json = res.json()
+                logger.error(f"[LinkVoice] textToVoice error, status_code={res.status_code}, msg={res_json.get('message')}")
+                return None
+        except Exception as e:
+            logger.error(e)
+            # reply = Reply(ReplyType.ERROR, "遇到了一点小问题，请稍后再问我吧")
+            return None
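
Review note: `textToVoice` builds its output path from a second-resolution timestamp plus `random.randint(0, 1000)`, so two replies generated in the same second can occasionally share a name. A minimal sketch of an alternative follows, assuming a UUID suffix is acceptable; `make_tmp_audio_path` is a hypothetical helper, and creating the directory is an addition the original code does not make.

```python
import datetime
import os
import uuid


def make_tmp_audio_path(directory: str = "tmp", suffix: str = ".mp3") -> str:
    """Build a practically collision-free path for a generated audio file."""
    # Assumption: it is fine to create the directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    # uuid4 supplies 122 random bits, versus roughly 1000 values from randint(0, 1000).
    return os.path.join(directory, f"{stamp}{uuid.uuid4().hex}{suffix}")
```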
--- a/voice/openai/openai_voice.py
+++ b/voice/openai/openai_voice.py
@@ -9,7 +9,9 @@ from bridge.reply import Reply, ReplyType
 from common.log import logger
 from config import conf
 from voice.voice import Voice
-
+import requests
+from common import const
+import datetime, random

 class OpenaiVoice(Voice):
    def __init__(self):
@@ -24,6 +26,31 @@ class OpenaiVoice(Voice):
            reply = Reply(ReplyType.TEXT, text)
            logger.info("[Openai] voiceToText text={} voice file name={}".format(text, voice_file))
        except Exception as e:
-            reply = Reply(ReplyType.ERROR, str(e))
+            reply = Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
        finally:
            return reply
+
+
+    def textToVoice(self, text):
+        try:
+            url = 'https://api.openai.com/v1/audio/speech'
+            headers = {
+                'Authorization': 'Bearer ' + conf().get("open_ai_api_key"),
+                'Content-Type': 'application/json'
+            }
+            data = {
+                'model': conf().get("text_to_voice_model") or const.TTS_1,
+                'input': text,
+                'voice': conf().get("tts_voice_id") or "alloy"
+            }
+            response = requests.post(url, headers=headers, json=data)
+            file_name = "tmp/" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') + str(random.randint(0, 1000)) + ".mp3"
+            logger.debug(f"[OPENAI] text_to_Voice file_name={file_name}, input={text}")
+            with open(file_name, 'wb') as f:
+                f.write(response.content)
+            logger.info(f"[OPENAI] text_to_Voice success")
+            reply = Reply(ReplyType.VOICE, file_name)
+        except Exception as e:
+            logger.error(e)
+            reply = Reply(ReplyType.ERROR, "遇到了一点小问题，请稍后再问我吧")
+        return reply
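
Review note: unlike the LinkAI variant, this `textToVoice` writes `response.content` to the `.mp3` file without checking the HTTP status, so an error body could be saved and sent as a voice reply. The sketch below shows one way to guard the write. `save_speech_response` is a hypothetical helper, and `response` is assumed to expose `status_code` and `content` the way a `requests.Response` does.

```python
from typing import Optional


def save_speech_response(response, file_name: str) -> Optional[str]:
    """Write audio bytes to file_name only when the request succeeded."""
    if response.status_code != 200:
        # An error body (typically JSON) should not end up inside an .mp3 file.
        return None
    with open(file_name, "wb") as f:
        f.write(response.content)
    return file_name
```

A caller could return a `ReplyType.ERROR` reply when the helper returns `None`, which matches the error path already present in the `except` branch.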
Author	SHA1	Message	Date
zhayujie	061d8a3a5f	Merge pull request #1488 from yy1781051483/master add xunfei v3.0	2023-11-17 16:29:39 +08:00
zhayujie	374cd5dbb8	feat: support send knowledge base image	2023-11-17 16:27:44 +08:00
zhayujie	5ad53c2b9c	fix: reduce error noise when converting speech to text	2023-11-16 10:54:24 +08:00
zhayujie	a2ec1a063d	fix: typo	2023-11-10 17:16:15 +08:00
zhayujie	e431dbe2df	docs: update readme.md	2023-11-10 17:13:13 +08:00
zhayujie	7218463f9e	docs: update README	2023-11-10 16:06:58 +08:00
zhayujie	aeb09a95b0	fix: image vision temporarily cancel error logging	2023-11-10 14:31:07 +08:00
zhayujie	0c8f292e12	feat: add tts speech model	2023-11-10 10:48:52 +08:00
zhayujie	f001ac6903	feat: add dalle3 gpt-4-turbo model change	2023-11-10 10:11:02 +08:00
zhayujie	db8e506de0	feat: add gpt-4-turbo tokens calc	2023-11-07 23:10:39 +08:00
zhayujie	099f859dd4	fix: limit openai sdk version to prevent compatibility issues	2023-11-07 10:34:46 +08:00
Daydreamer	b7684c1c2b	add xunfei v3.0	2023-10-29 17:38:56 +08:00
zhayujie	058c167f79	docs: trim help cmd	2023-10-27 14:30:33 +08:00
zhayujie	49446d4872	feat: add wenxin 4.0 model	2023-10-27 14:18:55 +08:00
zhayujie	ced560e1e1	Merge pull request #1485 from zhayujie/feat-agent feat: show thought and plugin in agent process	2023-10-27 13:27:38 +08:00
zhayujie	339102c3cd	Merge pull request #1482 from 6vision/master custom group-join welcome message and apilot plugin	2023-10-27 12:35:11 +08:00
zhayujie	6331350239	Merge branch 'master' into feat-agent	2023-10-27 12:32:35 +08:00
zhayujie	34e06fcbf8	feat: show thought and plugin in agent process	2023-10-27 12:28:34 +08:00
vision	70aac312ff	Merge branch 'zhayujie:master' into master	2023-10-25 21:12:48 +08:00
zhayujie	5e00704152	Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat	2023-10-23 21:09:54 +08:00
zhayujie	1a9edb6907	fix: plugin config not exist warning	2023-10-23 21:09:18 +08:00
zhayujie	0c18c3a6dd	docs: update demo vedio	2023-10-19 21:51:57 +08:00
6vision	847bb51ce4	add Apilot plugin	2023-10-19 19:34:36 +08:00
6vision	fa60a5dc63	add custom welcome-message parameter for new group members	2023-10-19 19:20:41 +08:00
zhayujie	aaed3f9839	fix: ignore system message	2023-10-18 11:14:44 +08:00
zhayujie	701daedf49	feat: multi agent plugin	2023-10-13 15:36:20 +08:00