Classify EU Legal Scholarship
A set of tools for classifying text snippets from EU legal scholarship by determining whether a listed entity is the grammatical subject (agent) of a sentence or clause.
Basic Usage Description
✔️ VALIDATE Whether Listed Entity Is Subject in Sentence
The tool `run_is_sentence_valance_basic()` expects a DataFrame with a `sentence_text` column (the sentence or snippet) and an `entity` column (the candidate entity). For each row, the prompt function combines the two values into a single user message. The LLM returns 1 if the entity is the agent/subject of the sentence, or 0 if it is not.
from gptquery.tools.tool_classify_text import run_is_sentence_valance_basic
df = ... # DataFrame with required text columns
df_out = run_is_sentence_valance_basic(df, api_key="your-openai-key")
print(df_out['is_agent'].tolist())
# e.g. [1, 0, 1, 'PROCESSING_ERROR']
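To make the "combines these into a user message" step concrete, here is a minimal sketch of what a per-row prompt builder could look like. The function name `build_user_message` and the prompt wording are assumptions for illustration only; the actual prompt function inside `gptquery` may be named and worded differently.

```python
def build_user_message(sentence_text: str, entity: str) -> str:
    """Hypothetical prompt builder: NOT the library's actual prompt text."""
    return (
        f"Sentence: {sentence_text}\n"
        f"Entity: {entity}\n"
        "Answer 1 if the entity is the agent/subject of the sentence, otherwise 0."
    )


# Rows shaped like the tool's input: (sentence_text, entity)
rows = [
    ("The Commission submitted observations on the admissibility of the request.", "Commission"),
    ("The request was submitted to the Commission by the referring court.", "Commission"),
]

# One user message per row
messages = [build_user_message(sentence, entity) for sentence, entity in rows]
print(messages[0])
```

Keeping the builder a pure function of the two column values makes it easy to inspect the exact text sent for any row before spending API calls.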
📤 Input/Output Schema
Input Columns (required by the prompt function):
| Column | Type | Description |
|---|---|---|
| sentence_text | str | The sentence or clause to classify |
| entity | str | The candidate entity to check as grammatical subject |
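Because both columns are required, it can help to check for them before calling the tool. The sketch below is a hypothetical pre-flight helper (`missing_columns` is not part of `gptquery`); it accepts any iterable of column names, such as `df.columns`.

```python
# Column names taken from the input schema above
REQUIRED_COLUMNS = ("sentence_text", "entity")


def missing_columns(columns):
    """Return the required column names that are absent, in schema order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


print(missing_columns(["sentence_text", "entity", "year"]))  # nothing missing
print(missing_columns(["sentence_text"]))                    # 'entity' is missing
```

With a DataFrame, this would be called as `missing_columns(df.columns)` and the run skipped or aborted if the returned list is non-empty.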
Output Column:
`is_agent` → `1` if the entity is the subject/agent of the sentence, `0` if not, `"PROCESSING_ERROR"` if the LLM returns an unexpected value, `"API_ERROR"` on request failure
💾 Example DataFrame
| sentence_text | entity | is_agent |
|---|---|---|
| "The Commission submitted observations on the admissibility of the request." | Commission | 1 |
| "The request was submitted to the Commission by the referring court." | Commission | 0 |
| "Member States must transpose the Directive by 31 December." | Member States | 1 |
| (malformed/empty row) | — | "PROCESSING_ERROR" |
Note: Unlike the extraction tools, `is_agent` returns a scalar integer per row (not a list), and invalid LLM outputs (result not in `['1', '0']`) are caught and flagged as `"PROCESSING_ERROR"` rather than silently coerced.
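The validation rule in the note can be sketched as follows. This is an illustrative reconstruction, not the library's code: the name `parse_is_agent` is hypothetical, and stripping surrounding whitespace before the check is an assumption. Request failures (`"API_ERROR"`) would be handled separately, around the API call itself.

```python
VALID_OUTPUTS = ("1", "0")


def parse_is_agent(raw: str):
    """Map a raw LLM reply to 1, 0, or "PROCESSING_ERROR" (hypothetical helper)."""
    cleaned = raw.strip()  # assumption: tolerate surrounding whitespace
    if cleaned not in VALID_OUTPUTS:
        # Flag unexpected replies instead of coercing them to a number
        return "PROCESSING_ERROR"
    return int(cleaned)


for reply in ["1", " 0\n", "yes", ""]:
    print(repr(reply), "->", parse_is_agent(reply))
```

Flagging rather than coercing keeps bad rows visible in the output column, so they can be filtered and re-run later.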