The Complete Guide to LLM Function Calling: From Tool Use Patterns to Production Design

What Is Function Calling?
- How It Works
OpenAI Function Calling
- Basic Usage
- Parallel Tool Calls
Anthropic Tool Use
Error Handling Patterns
- A Robust Function Execution Loop
Structured Output and Function Calling
Function Calling with Open-Source Models
- Ollama + Tool Use
- Serving Tool Use with vLLM
Production Design Patterns
Benchmark: Function Calling Performance Comparison

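What Is Function Calling?

Function calling (also called tool use) is a mechanism that lets an LLM invoke external functions and APIs. The LLM does not execute code itself: it decides which function to call and with which arguments, and the application performs the actual execution.

How It Works

User: "What's the weather in Seoul?"
    |
LLM: tool_call(get_weather, city="Seoul")
    |
App: runs get_weather("Seoul") -> {"temp": 5, "condition": "sunny"}
    |
LLM: "The current weather in Seoul is 5°C and sunny."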
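
The flow above can be sketched without any API call. In this minimal sketch the model's decision is hard-coded as a dictionary, and `get_weather`, `TOOLS`, and `run_tool_call` are hypothetical names used only for illustration; the point is that the application, not the LLM, executes the function.

```python
import json

def get_weather(city: str) -> dict:
    # Hypothetical stub standing in for a real weather API
    return {"city": city, "temp": 5, "condition": "sunny"}

# Map tool names to the Python callables the application is willing to run
TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> str:
    """Execute one model-issued tool call and return the result as a JSON string."""
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return json.dumps({"error": f"Unknown function: {tool_call['name']}"})
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return json.dumps(func(**args))

# What the LLM would emit as its tool call (hard-coded here instead of calling a model)
model_decision = {"name": "get_weather", "arguments": '{"city": "Seoul"}'}
tool_output = run_tool_call(model_decision)
print(tool_output)
```

The JSON string in `tool_output` is what would be sent back to the model so it can write the final answer.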

OpenAI Function Calling

Basic Usage

from openai import OpenAI
import json

client = OpenAI()

# Tool definitions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name (e.g., Seoul, Tokyo, New York)"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Search for products",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search keyword"},
                    "category": {"type": "string", "enum": ["electronics", "clothing", "food"]},
                    "max_price": {"type": "number", "description": "Maximum price"},
                    "sort_by": {"type": "string", "enum": ["price", "rating", "newest"]}
                },
                "required": ["query"]
            }
        }
    }
]

# First call: the model decides which tools to call
user_message = {"role": "user", "content": "What's the weather in Seoul? Also search for umbrellas."}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[user_message],
    tools=tools,
    tool_choice="auto"
)

message = response.choices[0].message
# tool_calls is None when the model answers directly, so fall back to an empty list
tool_calls = message.tool_calls or []
print(f"Tool calls: {len(tool_calls)}")

# Execute the tools and pass the results back
messages = [
    user_message,
    message  # the assistant message that carries the tool calls
]

# Append one result per tool_call
for tool_call in tool_calls:
    func_name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string

    # Stubbed results for illustration; a real application would call the function with args
    if func_name == "get_weather":
        result = {"temp": 5, "condition": "cloudy", "humidity": 65}
    elif func_name == "search_products":
        result = [
            {"name": "Folding umbrella", "price": 15000, "rating": 4.5},
            {"name": "Automatic umbrella", "price": 25000, "rating": 4.8}
        ]
    else:
        result = {"error": "Unknown function"}

    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result, ensure_ascii=False)
    })

# Generate the final response
final_response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools
)

print(final_response.choices[0].message.content)

Parallel Tool Calls

GPT-4o can return several tool calls in a single response, and the application can then run them concurrently:

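# "What's the weather in Seoul? Also search for umbrellas." -> two tool_calls come back in one response
# tool_calls = [
#   {"function": {"name": "get_weather", "arguments": '{"city": "Seoul"}'}},
#   {"function": {"name": "search_products", "arguments": '{"query": "umbrella"}'}}
# ]

import asyncio
import json

async def execute_tool_calls(tool_calls: list) -> list:
    """Run the tool calls concurrently; results keep the order of tool_calls."""
    async def execute_one(tc):
        func_name = tc.function.name
        args = json.loads(tc.function.arguments)

        # In practice these are async operations such as API calls.
        # get_weather_async and search_products_async are assumed to be defined elsewhere.
        if func_name == "get_weather":
            return await get_weather_async(**args)
        elif func_name == "search_products":
            return await search_products_async(**args)
        # Report unknown tools explicitly instead of silently returning None
        return {"error": f"Unknown function: {func_name}"}

    results = await asyncio.gather(*[execute_one(tc) for tc in tool_calls])
    return results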
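
One design point the function above leaves open is failure isolation: by default `asyncio.gather` propagates the first exception, so one failing tool hides the results of the others. A minimal sketch, assuming hypothetical stub tools `get_weather_async` and `search_products_async`, that uses `return_exceptions=True` to turn each failure into an error payload:

```python
import asyncio

async def get_weather_async(city: str) -> dict:
    # Hypothetical stub; a real implementation would call a weather API
    await asyncio.sleep(0.01)
    return {"city": city, "temp": 5}

async def search_products_async(query: str) -> list:
    # Hypothetical stub that always fails, to show error isolation
    await asyncio.sleep(0.01)
    raise ConnectionError("search backend unavailable")

async def run_all(calls: list[tuple]) -> list:
    """Run (coroutine_function, kwargs) pairs concurrently; results keep input order."""
    outcomes = await asyncio.gather(
        *(func(**kwargs) for func, kwargs in calls),
        return_exceptions=True,  # collect exceptions instead of raising the first one
    )
    return [
        {"error": f"{type(o).__name__}: {o}"} if isinstance(o, Exception) else o
        for o in outcomes
    ]

results = asyncio.run(run_all([
    (get_weather_async, {"city": "Seoul"}),
    (search_products_async, {"query": "umbrella"}),
]))
print(results)
```

Because every tool call still produces a result, each tool_call_id can be answered, with the error payload standing in for the failed call.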

Anthropic Tool Use

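Anthropic's Tool Use for Claude follows the same flow with a different API shape: tools are declared with input_schema, tool calls arrive as tool_use content blocks, and results go back as tool_result blocks in a user message: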

import json

import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
]

user_message = {"role": "user", "content": "What's the weather like in Seoul?"}

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[user_message]
)

# When stop_reason is "tool_use", the response content includes a tool_use block
if response.stop_reason == "tool_use":
    tool_use_block = next(
        block for block in response.content
        if block.type == "tool_use"
    )

    # Run the tool (get_weather is assumed to be defined by the application)
    result = get_weather(city=tool_use_block.input["city"])

    # Send the result back as a tool_result block inside a user message
    final_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=[
            user_message,
            {"role": "assistant", "content": response.content},
            {
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": tool_use_block.id,
                    "content": json.dumps(result, ensure_ascii=False)
                }]
            }
        ]
    )
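
The example above handles only the first tool_use block. A response can contain several, and each needs its own tool_result in the next user message. A minimal sketch under that assumption, using plain dictionaries in place of the SDK's content block objects (the SDK exposes the same fields as attributes) and a hypothetical `handlers` mapping:

```python
import json

def build_tool_results(content_blocks: list[dict], handlers: dict) -> list[dict]:
    """Return one tool_result block per tool_use block, preserving order."""
    results = []
    for block in content_blocks:
        if block["type"] != "tool_use":
            continue  # skip text blocks
        handler = handlers.get(block["name"])
        if handler is None:
            # is_error tells the model the call failed so it can recover
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Unknown tool: {block['name']}",
                "is_error": True,
            })
            continue
        output = handler(**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": json.dumps(output, ensure_ascii=False),
        })
    return results

# Hypothetical response content with one text block and two tool_use blocks
blocks = [
    {"type": "text", "text": "Let me check."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather", "input": {"city": "Seoul"}},
    {"type": "tool_use", "id": "toolu_02", "name": "get_time", "input": {}},
]
handlers = {"get_weather": lambda city: {"city": city, "temp": 5}}
tool_results = build_tool_results(blocks, handlers)
print(tool_results)
```

The returned list would be passed as the content of the next user message.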

Error Handling Patterns

A Robust Function Execution Loop

import asyncio  # needed for asyncio.sleep in the retry backoff
import json
from typing import Callable

class ToolExecutor:
    def __init__(self):
        self.tools: dict[str, Callable] = {}
        self.max_retries = 3

    def register(self, name: str, func: Callable):
        self.tools[name] = func

    async def execute(self, tool_call) -> dict:
        func_name = tool_call.function.name
        try:
            args = json.loads(tool_call.function.arguments)
        except json.JSONDecodeError:
            return {"error": f"Invalid JSON arguments: {tool_call.function.arguments}"}

        if func_name not in self.tools:
            return {"error": f"Unknown function: {func_name}"}

        for attempt in range(self.max_retries):
            try:
                result = await self.tools[func_name](**args)
                return {"success": True, "data": result}
            except TypeError as e:
                return {"error": f"Invalid arguments: {str(e)}"}
            except Exception as e:
                if attempt == self.max_retries - 1:
                    return {"error": f"Failed after {self.max_retries} retries: {str(e)}"}
                await asyncio.sleep(2 ** attempt)

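    async def run_conversation(self, client, messages, tools_spec, max_turns: int = 10):
        """Run the conversation loop until the model stops calling tools."""
        # Cap the number of turns so a model that keeps calling tools cannot loop forever
        for _ in range(max_turns):
            # client is the synchronous OpenAI client, so this call blocks the event loop;
            # AsyncOpenAI with `await` avoids that
            response = client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                tools=tools_spec
            )

            choice = response.choices[0]

            if choice.finish_reason == "tool_calls":
                messages.append(choice.message)

                for tc in choice.message.tool_calls:
                    result = await self.execute(tc)
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tc.id,
                        "content": json.dumps(result, ensure_ascii=False)
                    })
                continue

            # "stop", "length", "content_filter", ...: return whatever text was produced
            return choice.message.content

        raise RuntimeError(f"No final answer after {max_turns} turns")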
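
The executor above rejects malformed JSON and unknown functions, but it passes any well-formed arguments straight to the function. A minimal sketch of a pre-execution check against the tool's parameters schema; `validate_args` is a hypothetical helper that covers only required, type, and enum, and a full JSON Schema validator would be needed for anything more:

```python
# Map JSON Schema type names to Python types (only the subset used here).
# Note: bool is a subclass of int in Python, so True would pass the number checks.
JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}

def validate_args(args: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the args pass."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{name} should be of type {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems

weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(validate_args({"city": "Seoul", "unit": "celsius"}, weather_schema))
print(validate_args({"unit": "kelvin"}, weather_schema))
```

Returning the problems to the model as the tool result gives it a chance to correct the call on the next turn.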

Structured Output and Function Calling

Deriving a tool's parameters from a Pydantic model keeps the schema sent to the model and the validation code in one place:

from pydantic import BaseModel, Field

class WeatherResponse(BaseModel):
    city: str = Field(description="City name")
    temperature: float = Field(description="Current temperature")
    condition: str = Field(description="Weather condition")
    recommendation: str = Field(description="Clothing recommendation")

# Convert a Pydantic model into a tool definition (model_json_schema requires Pydantic v2)
def model_to_tool(model_class, name: str, description: str) -> dict:
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": model_class.model_json_schema()
        }
    }

weather_tool = model_to_tool(
    WeatherResponse,
    "format_weather",
    "Return weather information in a structured format"
)

Function Calling with Open-Source Models

Ollama + Tool Use

import ollama

response = ollama.chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': "What's the weather in Seoul?"}],
    tools=[{
        'type': 'function',
        'function': {
            'name': 'get_weather',
            'description': 'Get weather information',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {'type': 'string', 'description': 'City name'}
                },
                'required': ['city']
            }
        }
    }]
)

if response['message'].get('tool_calls'):
    for tool_call in response['message']['tool_calls']:
        print(f"Function: {tool_call['function']['name']}")
        print(f"Args: {tool_call['function']['arguments']}")
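
One practical difference when moving between backends: the OpenAI-style response delivers `arguments` as a JSON string, while the Ollama Python client returns them already parsed into a mapping. A minimal sketch of a hypothetical `normalize_arguments` helper that accepts either form so the same dispatch code can serve both:

```python
import json
from collections.abc import Mapping

def normalize_arguments(arguments) -> dict:
    """Accept tool-call arguments as a JSON string or a mapping and return a dict."""
    if isinstance(arguments, Mapping):
        return dict(arguments)
    if isinstance(arguments, str):
        parsed = json.loads(arguments or "{}")  # treat an empty string as no arguments
        if not isinstance(parsed, dict):
            raise ValueError("tool arguments must be a JSON object")
        return parsed
    raise TypeError(f"unsupported arguments type: {type(arguments).__name__}")

print(normalize_arguments('{"city": "Seoul"}'))   # OpenAI-style JSON string
print(normalize_arguments({"city": "Seoul"}))     # Ollama-style mapping
```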

Serving Tool Use with vLLM

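# Start the vLLM server
# llama3_json is the parser vLLM documents for Llama 3.1 models;
# the hermes parser targets Hermes-style models
# vllm serve meta-llama/Llama-3.1-8B-Instruct \
#   --enable-auto-tool-choice \
#   --tool-call-parser llama3_json

from openai import OpenAI

# vLLM exposes an OpenAI-compatible API, so the OpenAI client works unchanged
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in Seoul?"}],
    tools=tools,  # the same tool definitions as in the OpenAI example
    tool_choice="auto"
)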

Production Design Patterns

Controlling Tool Selection

# Force a call to a specific function
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}}
)

# Disable function calling
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
    tool_choice="none"
)

Permission-Based Tool Filtering

from typing import Callable

class ToolRegistry:
    def __init__(self):
        self.tools: dict[str, Callable] = {}
        self.specs: dict[str, dict] = {}
        self.permissions: dict[str, str] = {}

    def register(self, name: str, func: Callable, spec: dict, required_role: str = "user"):
        self.tools[name] = func
        self.specs[name] = spec  # keep the schema; it is what gets sent to the model
        self.permissions[name] = required_role

    def get_tools_for_role(self, role: str) -> list:
        """Return the tool definitions that the given role is allowed to see."""
        role_hierarchy = {"admin": 3, "operator": 2, "user": 1}
        user_level = role_hierarchy.get(role, 0)

        return [
            {"type": "function", "function": spec}
            for name, spec in self.specs.items()
            if role_hierarchy.get(self.permissions[name], 0) <= user_level
        ]
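
Filtering the tool list only controls what the model sees. A model can still emit a call to a tool name it was never offered, so a second check at execution time is a reasonable safeguard. A minimal sketch, where `ROLE_LEVELS`, `REQUIRED_ROLE`, `is_allowed`, and `guarded_execute` are hypothetical names and unknown roles or tools are denied by default:

```python
ROLE_LEVELS = {"admin": 3, "operator": 2, "user": 1}

# Hypothetical mapping from tool name to the minimum role that may run it
REQUIRED_ROLE = {"get_weather": "user", "restart_service": "operator", "delete_user": "admin"}

def is_allowed(role: str, tool_name: str) -> bool:
    """Return True only if the role is known, the tool is known, and the level suffices."""
    required = REQUIRED_ROLE.get(tool_name)
    if required is None or role not in ROLE_LEVELS:
        return False  # fail closed on anything unrecognized
    # A misconfigured required role maps to infinity, so nobody passes
    return ROLE_LEVELS[role] >= ROLE_LEVELS.get(required, float("inf"))

def guarded_execute(role: str, tool_name: str, run) -> dict:
    """Call the zero-argument run callable only after the permission check passes."""
    if not is_allowed(role, tool_name):
        return {"error": f"Role '{role}' may not call '{tool_name}'"}
    return {"success": True, "data": run()}

print(guarded_execute("user", "get_weather", lambda: {"temp": 5}))
print(guarded_execute("user", "delete_user", lambda: "deleted"))
```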

Token Optimization

def optimize_tool_result(result: dict | list, max_chars: int = 2000) -> str:
    """Shrink a tool result so that it costs fewer tokens."""
    result_str = json.dumps(result, ensure_ascii=False)

    if len(result_str) <= max_chars:
        return result_str

    # Summarize large lists: keep the first 10 items plus the total count
    if isinstance(result, list) and len(result) > 10:
        return json.dumps({
            "total_count": len(result),
            "showing": "first 10",
            "items": result[:10],
            "note": f"Showing only the first 10 of {len(result)} items"
        }, ensure_ascii=False)

    # Fallback: hard truncation (the output is no longer valid JSON)
    return result_str[:max_chars] + "... (truncated)"
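
Truncation is one option; another that keeps the output valid JSON is to drop fields the model does not need before serializing. A minimal sketch, assuming a hypothetical `project_fields` helper and a caller-supplied list of fields worth keeping:

```python
import json

def project_fields(items: list[dict], keep: list[str], limit: int = 10) -> str:
    """Keep only selected fields of the first `limit` items and report the total count."""
    trimmed = [{k: item[k] for k in keep if k in item} for item in items[:limit]]
    return json.dumps(
        {"total_count": len(items), "items": trimmed},
        ensure_ascii=False,
        separators=(",", ":"),  # compact separators save a few characters per field
    )

# Hypothetical search results carrying fields the model does not need
products = [
    {"name": "Folding umbrella", "price": 15000, "rating": 4.5, "sku": "A-1", "html": "<div>...</div>"},
    {"name": "Automatic umbrella", "price": 25000, "rating": 4.8, "sku": "A-2", "html": "<div>...</div>"},
]
compact = project_fields(products, keep=["name", "price"])
print(compact)
```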

Benchmark: Function Calling Performance Comparison

Model	Single-call accuracy	Parallel-call accuracy	Argument parsing accuracy
GPT-4o	97.2%	94.5%	98.1%
Claude 3.5 Sonnet	96.8%	93.2%	97.5%
Llama 3.1 70B	91.5%	85.3%	93.2%
Llama 3.1 8B	84.2%	72.1%	88.7%
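
The table does not state how the metrics were computed. As one plausible reading, and not necessarily the method behind these numbers, single-call accuracy can be scored as the fraction of test cases where the predicted function name and arguments both match the expected call. A minimal sketch with hypothetical test cases:

```python
import json

def call_matches(predicted: dict, expected: dict) -> bool:
    """A call is correct when the name matches and the parsed arguments are equal."""
    if predicted.get("name") != expected["name"]:
        return False
    try:
        args = json.loads(predicted.get("arguments", ""))
    except json.JSONDecodeError:
        return False  # unparseable arguments count as a miss
    return args == expected["arguments"]

def accuracy(cases: list[tuple[dict, dict]]) -> float:
    """Fraction of (predicted, expected) pairs that match exactly."""
    if not cases:
        return 0.0
    return sum(call_matches(p, e) for p, e in cases) / len(cases)

# Hypothetical evaluation data: (model output, expected call)
cases = [
    ({"name": "get_weather", "arguments": '{"city": "Seoul"}'},
     {"name": "get_weather", "arguments": {"city": "Seoul"}}),
    ({"name": "get_weather", "arguments": '{"city": "Tokyo"'},   # malformed JSON
     {"name": "get_weather", "arguments": {"city": "Tokyo"}}),
    ({"name": "search_products", "arguments": '{"query": "umbrella"}'},
     {"name": "get_weather", "arguments": {"city": "Seoul"}}),   # wrong function
    ({"name": "search_products", "arguments": '{"query": "umbrella"}'},
     {"name": "search_products", "arguments": {"query": "umbrella"}}),
]
print(f"accuracy = {accuracy(cases):.2f}")
```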

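Review Quiz (6 Questions)

Q1. What is the LLM's role in function calling?

The LLM decides which function to call and with which arguments. The application performs the actual execution.

Q2. What is the biggest difference between the OpenAI and Anthropic function calling APIs?

OpenAI uses tool_calls on the assistant message and separate messages with the tool role; Anthropic uses tool_use and tool_result blocks inside the message content.

Q3. When are parallel tool calls useful?

When a request involves several mutually independent tasks (e.g., a weather lookup plus a product search), running them concurrently reduces latency.

Q4. What are the three tool_choice values covered in this guide, and what does each mean?

auto (the LLM decides), none (function calling disabled), and a specific function (forced call). The OpenAI API also accepts required, which forces the model to call at least one tool.

Q5. What configuration is needed to support function calling with open-source models?

With vLLM, enable the --enable-auto-tool-choice and --tool-call-parser options; with Ollama, pass the tools parameter.

Q6. What token optimization strategies apply when a tool result is too long?

Truncate the result or, for lists, return only the top N items and include the total count as metadata.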