From 3c6781d2403301f86d3d8c4bf8e3393e71384802 Mon Sep 17 00:00:00 2001 From: zhayujie Date: Mon, 9 Mar 2026 12:02:52 +0800 Subject: [PATCH] refactor: inline skill-creator reference files into SKILL.md --- skills/openai-image-vision/EXAMPLE.md | 168 --------------- skills/openai-image-vision/README.md | 182 ---------------- skills/skill-creator/LICENSE.txt | 202 ------------------ skills/skill-creator/SKILL.md | 19 +- .../references/output-patterns.md | 82 ------- skills/skill-creator/references/workflows.md | 28 --- 6 files changed, 14 insertions(+), 667 deletions(-) delete mode 100644 skills/openai-image-vision/EXAMPLE.md delete mode 100644 skills/openai-image-vision/README.md delete mode 100644 skills/skill-creator/LICENSE.txt delete mode 100644 skills/skill-creator/references/output-patterns.md delete mode 100644 skills/skill-creator/references/workflows.md diff --git a/skills/openai-image-vision/EXAMPLE.md b/skills/openai-image-vision/EXAMPLE.md deleted file mode 100644 index 295460f4..00000000 --- a/skills/openai-image-vision/EXAMPLE.md +++ /dev/null @@ -1,168 +0,0 @@ -# OpenAI Image Vision - Usage Examples - -## Setup - -Set up your API credentials using the agent's env_config tool: - -```bash -# Set your OpenAI API key -env_config(action="set", key="OPENAI_API_KEY", value="sk-your-api-key-here") - -# Optional: Set custom API base URL (for proxy or compatible services) -env_config(action="set", key="OPENAI_API_BASE", value="https://api.openai.com/v1") -``` - -## Example 1: Analyze a Local Image - -```bash -bash scripts/vision.sh "/path/to/photo.jpg" "What's in this image?" -``` - -**Expected Output:** -```json -{ - "model": "gpt-4.1-mini", - "content": "The image shows a beautiful landscape with mountains in the background and a lake in the foreground. 
The sky is clear with some clouds, and there are trees along the shoreline.", - "usage": { - "prompt_tokens": 1234, - "completion_tokens": 45, - "total_tokens": 1279 - } -} -``` - -## Example 2: Analyze an Image from URL - -```bash -bash scripts/vision.sh "https://example.com/image.jpg" "Describe this image in detail" -``` - -## Example 3: Extract Text (OCR) - -```bash -bash scripts/vision.sh "document.png" "Extract all text from this image" -``` - -**Use Case:** Extract text from screenshots, scanned documents, or photos of text. - -## Example 4: Identify Objects - -```bash -bash scripts/vision.sh "scene.jpg" "List all objects you can identify in this image" -``` - -## Example 5: Analyze Colors and Composition - -```bash -bash scripts/vision.sh "artwork.jpg" "Describe the color palette and composition of this image" -``` - -## Example 6: Count Items - -```bash -bash scripts/vision.sh "crowd.jpg" "How many people are in this image?" -``` - -## Example 7: Use Different Models - -```bash -# Use gpt-4.1-mini (default, latest mini model) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4.1-mini" - -# Use gpt-4.1 (most capable, best for complex analysis) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4.1" - -# Use gpt-4o-mini (previous mini model) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4o-mini" -``` - -## Example 8: Complex Analysis - -```bash -bash scripts/vision.sh "product.jpg" "Analyze this product image. Describe the product, its features, colors, and suggest what kind of marketing copy would work well for it." -``` - -## Example 9: Safety and Content Moderation - -```bash -bash scripts/vision.sh "content.jpg" "Is there any inappropriate or unsafe content in this image?" -``` - -## Example 10: Technical Analysis - -```bash -bash scripts/vision.sh "diagram.png" "Explain what this technical diagram represents and how it works" -``` - -## Integration with Agent - -When the agent loads this skill, it will be available in the `` section. 
The agent can use it like: - -```bash -bash "/scripts/vision.sh" "user_uploaded_image.jpg" "What's in this image?" -``` - -The `` will be automatically provided by the skill system. - -## Error Handling Examples - -### Missing API Key -```bash -$ bash scripts/vision.sh "image.jpg" "What is this?" -{"error": "OPENAI_API_KEY environment variable is not set", "help": "Visit https://platform.openai.com/api-keys to get an API key"} -``` - -### File Not Found -```bash -$ bash scripts/vision.sh "nonexistent.jpg" "What is this?" -{"error": "Image file not found", "path": "nonexistent.jpg"} -``` - -### Unsupported Format -```bash -$ bash scripts/vision.sh "file.bmp" "What is this?" -{"error": "Unsupported image format", "extension": "bmp", "supported": ["jpg", "jpeg", "png", "gif", "webp"]} -``` - -### Missing Parameters -```bash -$ bash scripts/vision.sh -{"error": "Image path or URL is required", "usage": "bash vision.sh [model]"} -``` - -## Tips for Best Results - -1. **Be Specific**: Ask clear, specific questions about what you want to know -2. **Image Quality**: Higher quality images generally produce better results -3. **Model Selection**: - - Use `gpt-4.1` for complex analysis requiring highest accuracy - - Use `gpt-4.1-mini` (default) for most tasks - latest mini model with good balance -4. **Text Extraction**: For OCR tasks, ensure text is clearly visible and not too small -5. **Multiple Aspects**: You can ask about multiple things in one question -6. 
**Context**: Provide context in your question if needed (e.g., "This is a medical scan, what do you see?") - -## Performance Notes - -- **Local Files**: Automatically base64-encoded, adds ~33% size overhead -- **URLs**: Passed directly to API, no encoding overhead -- **Timeout**: 60 seconds for API calls -- **Max Tokens**: 1000 tokens for responses (configurable in script) -- **Rate Limits**: Subject to your OpenAI API plan - -## Supported Image Formats - -✅ JPEG (`.jpg`, `.jpeg`) -✅ PNG (`.png`) -✅ GIF (`.gif`) -✅ WebP (`.webp`) - -❌ BMP, TIFF, SVG, and other formats are not supported - -## Cost Considerations - -Vision API calls cost more than text-only calls because they include image tokens. Costs vary by: -- Model used (gpt-4.1 vs gpt-4.1-mini) -- Image size and resolution -- Length of response - -Check OpenAI's pricing page for current rates: https://openai.com/pricing diff --git a/skills/openai-image-vision/README.md b/skills/openai-image-vision/README.md deleted file mode 100644 index 7f88c9c8..00000000 --- a/skills/openai-image-vision/README.md +++ /dev/null @@ -1,182 +0,0 @@ -# OpenAI Image Vision Skill - -This skill enables image analysis using OpenAI's Vision API (GPT-4 Vision models). - -## Features - -- ✅ Analyze images from local files or URLs -- ✅ Support for multiple image formats (JPEG, PNG, GIF, WebP) -- ✅ Automatic base64 encoding for local files -- ✅ Direct URL passing for remote images -- ✅ Configurable model selection -- ✅ Custom API base URL support -- ✅ Pure bash/curl implementation (no Python dependencies) - -## Quick Start - -1. **Set up API credentials using env_config:** - ```bash - env_config(action="set", key="OPENAI_API_KEY", value="sk-your-api-key-here") - # Optional: custom API base - env_config(action="set", key="OPENAI_API_BASE", value="https://api.openai.com/v1") - ``` - -2. **Analyze an image:** - ```bash - bash scripts/vision.sh "/path/to/photo.jpg" "What's in this image?" - ``` - -3. 
**Analyze from URL:** - ```bash - bash scripts/vision.sh "https://example.com/image.jpg" "Describe this image" - ``` - ```bash - bash scripts/vision.sh "/path/to/image.jpg" "What's in this image?" - ``` - -3. **Analyze from URL:** - ```bash - bash scripts/vision.sh "https://example.com/image.jpg" "Describe this image" - ``` - -## Usage Examples - -### Basic image analysis -```bash -bash scripts/vision.sh "photo.jpg" "What objects can you see?" -``` - -### Text extraction (OCR) -```bash -bash scripts/vision.sh "document.png" "Extract all text from this image" -``` - -### Detailed description -```bash -bash scripts/vision.sh "scene.jpg" "Describe this scene in detail, including colors, mood, and composition" -``` - -### Using different models -```bash -# Use gpt-4.1-mini (default, latest mini model) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4.1-mini" - -# Use gpt-4.1 (most capable, latest model) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4.1" - -# Use gpt-4o-mini (previous mini model) -bash scripts/vision.sh "image.jpg" "Analyze this" "gpt-4o-mini" -``` - -## Environment Variables - -| Variable | Required | Default | Description | -|----------|----------|---------|-------------| -| `OPENAI_API_KEY` | No* | - | OpenAI API key (preferred) | -| `OPENAI_API_BASE` | No | `https://api.openai.com/v1` | Custom OpenAI API base URL | -| `LINKAI_API_KEY` | No* | - | LinkAI API key (fallback when OPENAI_API_KEY is not set) | -| `LINKAI_API_BASE` | No | `https://api.link-ai.tech` | LinkAI API base URL | - -\* At least one of `OPENAI_API_KEY` or `LINKAI_API_KEY` must be set. OpenAI takes priority when both are configured. 
- -## Response Format - -Success response: -```json -{ - "model": "gpt-4.1-mini", - "content": "The image shows a beautiful sunset over mountains...", - "usage": { - "prompt_tokens": 1234, - "completion_tokens": 567, - "total_tokens": 1801 - } -} -``` - -Error response: -```json -{ - "error": "Error description", - "details": "Additional information" -} -``` - -## Supported Models - -- `gpt-4.1-mini` (default) - Latest mini model, fast and cost-effective -- `gpt-4.1` - Latest GPT-4 variant, most capable -- `gpt-4o-mini` - Previous generation mini model -- `gpt-4-turbo` - Previous generation turbo model - -## Supported Image Formats - -- JPEG (`.jpg`, `.jpeg`) -- PNG (`.png`) -- GIF (`.gif`) -- WebP (`.webp`) - -## Technical Details - -- **Implementation**: Pure bash script using curl and base64 -- **Timeout**: 60 seconds for API calls -- **Max tokens**: 1000 tokens for responses -- **Image handling**: - - Local files are automatically base64-encoded - - URLs are passed directly to the API - - MIME types are auto-detected from file extensions - -## Error Handling - -The script handles various error cases: -- Missing required parameters -- Missing API key -- File not found -- Unsupported image formats -- API errors -- Network timeouts -- Invalid JSON responses - -## Integration with Agent System - -When loaded by the agent system, this skill will appear in `` with a `` path. Use it like: - -```bash -bash "/scripts/vision.sh" "image.jpg" "What's in this image?" 
-``` - -The agent will automatically: -- Load environment variables from `~/.cow/.env` -- Provide the correct `` path -- Handle skill discovery and registration - -## Notes - -- Images are sent to OpenAI's servers for processing -- Large images may be automatically resized by the API -- Rate limits depend on your OpenAI API plan -- Token usage includes both the image and text in the prompt -- Base64 encoding increases the size of local images by ~33% - -## Troubleshooting - -**"OPENAI_API_KEY environment variable is not set"** -- Set the environment variable using env_config tool -- Or use the agent's env_config tool - -**"Image file not found"** -- Check the file path is correct -- Use absolute paths or paths relative to current directory - -**"Unsupported image format"** -- Only JPEG, PNG, GIF, and WebP are supported -- Check the file extension matches the actual format - -**"Failed to call OpenAI API"** -- Check your internet connection -- Verify the API key is valid -- Check if custom API base URL is correct - -## License - -Part of the chatgpt-on-wechat project. diff --git a/skills/skill-creator/LICENSE.txt b/skills/skill-creator/LICENSE.txt deleted file mode 100644 index 7a4a3ea2..00000000 --- a/skills/skill-creator/LICENSE.txt +++ /dev/null @@ -1,202 +0,0 @@ - - Apache License - Version 2.0, January 2004 - http://www.apache.org/licenses/ - - TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION - - 1. Definitions. - - "License" shall mean the terms and conditions for use, reproduction, - and distribution as defined by Sections 1 through 9 of this document. - - "Licensor" shall mean the copyright owner or entity authorized by - the copyright owner that is granting the License. - - "Legal Entity" shall mean the union of the acting entity and all - other entities that control, are controlled by, or are under common - control with that entity. 
For the purposes of this definition, - "control" means (i) the power, direct or indirect, to cause the - direction or management of such entity, whether by contract or - otherwise, or (ii) ownership of fifty percent (50%) or more of the - outstanding shares, or (iii) beneficial ownership of such entity. - - "You" (or "Your") shall mean an individual or Legal Entity - exercising permissions granted by this License. - - "Source" form shall mean the preferred form for making modifications, - including but not limited to software source code, documentation - source, and configuration files. - - "Object" form shall mean any form resulting from mechanical - transformation or translation of a Source form, including but - not limited to compiled object code, generated documentation, - and conversions to other media types. - - "Work" shall mean the work of authorship, whether in Source or - Object form, made available under the License, as indicated by a - copyright notice that is included in or attached to the work - (an example is provided in the Appendix below). - - "Derivative Works" shall mean any work, whether in Source or Object - form, that is based on (or derived from) the Work and for which the - editorial revisions, annotations, elaborations, or other modifications - represent, as a whole, an original work of authorship. For the purposes - of this License, Derivative Works shall not include works that remain - separable from, or merely link (or bind by name) to the interfaces of, - the Work and Derivative Works thereof. - - "Contribution" shall mean any work of authorship, including - the original version of the Work and any modifications or additions - to that Work or Derivative Works thereof, that is intentionally - submitted to Licensor for inclusion in the Work by the copyright owner - or by an individual or Legal Entity authorized to submit on behalf of - the copyright owner. 
For the purposes of this definition, "submitted" - means any form of electronic, verbal, or written communication sent - to the Licensor or its representatives, including but not limited to - communication on electronic mailing lists, source code control systems, - and issue tracking systems that are managed by, or on behalf of, the - Licensor for the purpose of discussing and improving the Work, but - excluding communication that is conspicuously marked or otherwise - designated in writing by the copyright owner as "Not a Contribution." - - "Contributor" shall mean Licensor and any individual or Legal Entity - on behalf of whom a Contribution has been received by Licensor and - subsequently incorporated within the Work. - - 2. Grant of Copyright License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - copyright license to reproduce, prepare Derivative Works of, - publicly display, publicly perform, sublicense, and distribute the - Work and such Derivative Works in Source or Object form. - - 3. Grant of Patent License. Subject to the terms and conditions of - this License, each Contributor hereby grants to You a perpetual, - worldwide, non-exclusive, no-charge, royalty-free, irrevocable - (except as stated in this section) patent license to make, have made, - use, offer to sell, sell, import, and otherwise transfer the Work, - where such license applies only to those patent claims licensable - by such Contributor that are necessarily infringed by their - Contribution(s) alone or by combination of their Contribution(s) - with the Work to which such Contribution(s) was submitted. 
If You - institute patent litigation against any entity (including a - cross-claim or counterclaim in a lawsuit) alleging that the Work - or a Contribution incorporated within the Work constitutes direct - or contributory patent infringement, then any patent licenses - granted to You under this License for that Work shall terminate - as of the date such litigation is filed. - - 4. Redistribution. You may reproduce and distribute copies of the - Work or Derivative Works thereof in any medium, with or without - modifications, and in Source or Object form, provided that You - meet the following conditions: - - (a) You must give any other recipients of the Work or - Derivative Works a copy of this License; and - - (b) You must cause any modified files to carry prominent notices - stating that You changed the files; and - - (c) You must retain, in the Source form of any Derivative Works - that You distribute, all copyright, patent, trademark, and - attribution notices from the Source form of the Work, - excluding those notices that do not pertain to any part of - the Derivative Works; and - - (d) If the Work includes a "NOTICE" text file as part of its - distribution, then any Derivative Works that You distribute must - include a readable copy of the attribution notices contained - within such NOTICE file, excluding those notices that do not - pertain to any part of the Derivative Works, in at least one - of the following places: within a NOTICE text file distributed - as part of the Derivative Works; within the Source form or - documentation, if provided along with the Derivative Works; or, - within a display generated by the Derivative Works, if and - wherever such third-party notices normally appear. The contents - of the NOTICE file are for informational purposes only and - do not modify the License. 
You may add Your own attribution - notices within Derivative Works that You distribute, alongside - or as an addendum to the NOTICE text from the Work, provided - that such additional attribution notices cannot be construed - as modifying the License. - - You may add Your own copyright statement to Your modifications and - may provide additional or different license terms and conditions - for use, reproduction, or distribution of Your modifications, or - for any such Derivative Works as a whole, provided Your use, - reproduction, and distribution of the Work otherwise complies with - the conditions stated in this License. - - 5. Submission of Contributions. Unless You explicitly state otherwise, - any Contribution intentionally submitted for inclusion in the Work - by You to the Licensor shall be under the terms and conditions of - this License, without any additional terms or conditions. - Notwithstanding the above, nothing herein shall supersede or modify - the terms of any separate license agreement you may have executed - with Licensor regarding such Contributions. - - 6. Trademarks. This License does not grant permission to use the trade - names, trademarks, service marks, or product names of the Licensor, - except as required for reasonable and customary use in describing the - origin of the Work and reproducing the content of the NOTICE file. - - 7. Disclaimer of Warranty. Unless required by applicable law or - agreed to in writing, Licensor provides the Work (and each - Contributor provides its Contributions) on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or - implied, including, without limitation, any warranties or conditions - of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A - PARTICULAR PURPOSE. You are solely responsible for determining the - appropriateness of using or redistributing the Work and assume any - risks associated with Your exercise of permissions under this License. - - 8. 
Limitation of Liability. In no event and under no legal theory, - whether in tort (including negligence), contract, or otherwise, - unless required by applicable law (such as deliberate and grossly - negligent acts) or agreed to in writing, shall any Contributor be - liable to You for damages, including any direct, indirect, special, - incidental, or consequential damages of any character arising as a - result of this License or out of the use or inability to use the - Work (including but not limited to damages for loss of goodwill, - work stoppage, computer failure or malfunction, or any and all - other commercial damages or losses), even if such Contributor - has been advised of the possibility of such damages. - - 9. Accepting Warranty or Additional Liability. While redistributing - the Work or Derivative Works thereof, You may choose to offer, - and charge a fee for, acceptance of support, warranty, indemnity, - or other liability obligations and/or rights consistent with this - License. However, in accepting such obligations, You may act only - on Your own behalf and on Your sole responsibility, not on behalf - of any other Contributor, and only if You agree to indemnify, - defend, and hold each Contributor harmless for any liability - incurred by, or claims asserted against, such Contributor by reason - of your accepting any such warranty or additional liability. - - END OF TERMS AND CONDITIONS - - APPENDIX: How to apply the Apache License to your work. - - To apply the Apache License to your work, attach the following - boilerplate notice, with the fields enclosed by brackets "[]" - replaced with your own identifying information. (Don't include - the brackets!) The text should be enclosed in the appropriate - comment syntax for the file format. We also recommend that a - file or class name and description of purpose be included on the - same "printed page" as the copyright notice for easier - identification within third-party archives. 
- - Copyright [yyyy] [name of copyright owner] - - Licensed under the Apache License, Version 2.0 (the "License"); - you may not use this file except in compliance with the License. - You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. \ No newline at end of file diff --git a/skills/skill-creator/SKILL.md b/skills/skill-creator/SKILL.md index c0375176..28ff57e6 100644 --- a/skills/skill-creator/SKILL.md +++ b/skills/skill-creator/SKILL.md @@ -210,14 +210,23 @@ After initialization, customize the SKILL.md and add resources as needed. If you When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of the agent to use. Include information that would be beneficial and non-obvious to the agent. Consider what procedural knowledge, domain-specific details, or reusable assets would help another agent instance execute these tasks more effectively. -#### Learn Proven Design Patterns +#### Design Patterns -Consult these helpful guides based on your skill's needs: +**Workflow patterns** — For complex tasks, break operations into sequential steps or conditional branches: -- **Multi-step processes**: See references/workflows.md for sequential workflows and conditional logic -- **Specific output formats or quality standards**: See references/output-patterns.md for template and example patterns +```markdown +# Sequential: list numbered steps with scripts +1. Analyze the form (run analyze_form.py) +2. Create field mapping (edit fields.json) +3. Fill the form (run fill_form.py) -These files contain established best practices for effective skill design. 
+# Conditional: guide through decision points +1. Determine the modification type: + **Creating new content?** → Follow "Creation workflow" + **Editing existing content?** → Follow "Editing workflow" +``` + +**Output patterns** — When consistent output format matters, provide a template or input/output examples in SKILL.md so the agent can follow the desired style. #### Start with Reusable Skill Contents diff --git a/skills/skill-creator/references/output-patterns.md b/skills/skill-creator/references/output-patterns.md deleted file mode 100644 index 073ddda5..00000000 --- a/skills/skill-creator/references/output-patterns.md +++ /dev/null @@ -1,82 +0,0 @@ -# Output Patterns - -Use these patterns when skills need to produce consistent, high-quality output. - -## Template Pattern - -Provide templates for output format. Match the level of strictness to your needs. - -**For strict requirements (like API responses or data formats):** - -```markdown -## Report structure - -ALWAYS use this exact template structure: - -# [Analysis Title] - -## Executive summary -[One-paragraph overview of key findings] - -## Key findings -- Finding 1 with supporting data -- Finding 2 with supporting data -- Finding 3 with supporting data - -## Recommendations -1. Specific actionable recommendation -2. Specific actionable recommendation -``` - -**For flexible guidance (when adaptation is useful):** - -```markdown -## Report structure - -Here is a sensible default format, but use your best judgment: - -# [Analysis Title] - -## Executive summary -[Overview] - -## Key findings -[Adapt sections based on what you discover] - -## Recommendations -[Tailor to the specific context] - -Adjust sections as needed for the specific analysis type. 
-``` - -## Examples Pattern - -For skills where output quality depends on seeing examples, provide input/output pairs: - -```markdown -## Commit message format - -Generate commit messages following these examples: - -**Example 1:** -Input: Added user authentication with JWT tokens -Output: -``` -feat(auth): implement JWT-based authentication - -Add login endpoint and token validation middleware -``` - -**Example 2:** -Input: Fixed bug where dates displayed incorrectly in reports -Output: -``` -fix(reports): correct date formatting in timezone conversion - -Use UTC timestamps consistently across report generation -``` - -Follow this style: type(scope): brief description, then detailed explanation. -``` - -Examples help Claude understand the desired style and level of detail more clearly than descriptions alone. diff --git a/skills/skill-creator/references/workflows.md b/skills/skill-creator/references/workflows.md deleted file mode 100644 index a350c3cc..00000000 --- a/skills/skill-creator/references/workflows.md +++ /dev/null @@ -1,28 +0,0 @@ -# Workflow Patterns - -## Sequential Workflows - -For complex tasks, break operations into clear, sequential steps. It is often helpful to give Claude an overview of the process towards the beginning of SKILL.md: - -```markdown -Filling a PDF form involves these steps: - -1. Analyze the form (run analyze_form.py) -2. Create field mapping (edit fields.json) -3. Validate mapping (run validate_fields.py) -4. Fill the form (run fill_form.py) -5. Verify output (run verify_output.py) -``` - -## Conditional Workflows - -For tasks with branching logic, guide Claude through decision points: - -```markdown -1. Determine the modification type: - **Creating new content?** → Follow "Creation workflow" below - **Editing existing content?** → Follow "Editing workflow" below - -2. Creation workflow: [steps] -3. Editing workflow: [steps] -``` \ No newline at end of file
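
Reviewer note: the two workflow patterns this patch inlines into SKILL.md (numbered sequential steps, and a decision point that selects a sub-workflow) can be illustrated with a minimal Python sketch. The names `run_sequential` and `choose_workflow` are hypothetical and are not part of the skill system. The step bodies are placeholders that only record their labels; they do not invoke `analyze_form.py` or `fill_form.py`.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (label, action) pair. Both helpers below are hypothetical
# illustrations of the patterns described in SKILL.md, not an existing API.
Step = Tuple[str, Callable[[], None]]


def run_sequential(steps: List[Step]) -> List[str]:
    """Run each step in order and return the labels of the completed steps."""
    completed: List[str] = []
    for label, action in steps:
        action()  # an exception here stops the workflow at the failing step
        completed.append(label)
    return completed


def choose_workflow(modification_type: str,
                    workflows: Dict[str, List[Step]]) -> List[Step]:
    """Conditional pattern: pick the sub-workflow for a decision point."""
    if modification_type not in workflows:
        raise ValueError(f"unknown modification type: {modification_type!r}")
    return workflows[modification_type]


if __name__ == "__main__":
    log: List[str] = []

    # Placeholder actions: each one only records that it ran.
    workflows: Dict[str, List[Step]] = {
        "create": [
            ("Analyze the form", lambda: log.append("analyze")),
            ("Create field mapping", lambda: log.append("map")),
            ("Fill the form", lambda: log.append("fill")),
        ],
        "edit": [
            ("Load existing content", lambda: log.append("load")),
        ],
    }

    done = run_sequential(choose_workflow("create", workflows))
    print(done)
    print(log)
```

Keeping the decision point separate from the step runner mirrors how the inlined text presents the two patterns: the branch only selects which numbered list to follow, and the list itself is always executed top to bottom.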