# CSV Module
The CSV module provides functionality for loading test sets from CSV files.
## load_testset

`ragpill.csv.testset.load_testset`

```python
load_testset(
    csv_path,
    evaluator_classes=default_evaluator_classes,
    skip_unknown_evaluators=False,
    question_column='Question',
    test_type_column='test_type',
    expected_column='expected',
    tags_column='tags',
    check_column='check',
)
```
Create a Dataset from a CSV file with evaluator configurations.
Each evaluator class must implement a `from_csv_line()` class method that accepts:

- Standard CSV columns: `expected`, `tags`, `check`
- Additional CSV columns as `**kwargs` (passed to `evaluator.attributes`)
### CSV Format

The CSV file should contain the following standard columns:

- `Question`: the input question/prompt for the test case
- `test_type`: name of the evaluator class (must match a key in the `evaluator_classes` dict)
- `expected`, `tags`, `check`: standard evaluator parameters

For detailed descriptions of these parameters, see `ragpill.base.BaseEvaluator.from_csv_line`.
Any additional columns (e.g., `priority`, `category`, `domain`) will be:

1. Passed to each evaluator's `attributes` dict via `**kwargs` in `from_csv_line()`
2. Promoted to the Test Case metadata, and thereby made visible in MLflow, if all evaluators for a question have the same value for that attribute
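The promotion rule in step 2 can be sketched in plain Python. This is an illustrative stand-in, not the library's actual implementation:

```python
def promote_shared_attributes(evaluator_attrs: list[dict]) -> dict:
    """Return the attributes whose value is identical across all evaluators.

    Illustrative sketch of the promotion rule: an attribute becomes Case
    metadata only if every evaluator for the question carries the same value.
    """
    if not evaluator_attrs:
        return {}
    # Start from the first evaluator's attributes and keep only keys that
    # every other evaluator carries with the same value.
    shared = dict(evaluator_attrs[0])
    for attrs in evaluator_attrs[1:]:
        shared = {k: v for k, v in shared.items() if attrs.get(k) == v}
    return shared

# Two evaluators for the same question: 'priority' matches, 'category' differs,
# so only 'priority' would be promoted to Case metadata.
metadata = promote_shared_attributes([
    {"priority": "high", "category": "science"},
    {"priority": "high", "category": "validation"},
])
print(metadata)  # {'priority': 'high'}
```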
### Global Evaluators

Rows with an empty question are treated as global evaluators and are added to ALL test cases:

```csv
Question,test_type,expected,tags,check
,LLMJudge,true,global,"response is polite"
What is X?,RegexEvaluator,true,factual,"X.*definition"
```
The LLMJudge evaluator will be added to all cases, including the "What is X?" case.
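The grouping behaviour can be illustrated with a self-contained standard-library sketch (column names as in the format above; this is not the library's internal code):

```python
import csv
import io

# Sketch: rows with an empty Question are collected as "global" and
# attached to every question-specific case.
raw = """Question,test_type,expected,tags,check
,LLMJudge,true,global,response is polite
What is X?,RegexEvaluator,true,factual,X.*definition
"""

global_rows, cases = [], {}
for row in csv.DictReader(io.StringIO(raw)):
    if row["Question"].strip():
        cases.setdefault(row["Question"], []).append(row["test_type"])
    else:
        global_rows.append(row["test_type"])

# Every case receives its own evaluators plus all global ones.
for question, evaluators in cases.items():
    evaluators.extend(global_rows)

print(cases)  # {'What is X?': ['RegexEvaluator', 'LLMJudge']}
```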
### Custom Attributes

You can add custom columns to track metadata:

```csv
Question,test_type,expected,tags,check,priority,category
What is X?,LLMJudge,true,factual,"contains the fact, that x is ...",high,science
What is Y's email?,RegexEvaluator,true,"auth,contacts","y@example.com",low,validation
```

These custom attributes (`priority`, `category`) are automatically:

- Available in `evaluator.attributes`
- Promoted to Case metadata if all evaluators share the same value
- Visible in MLflow tracking for analysis and filtering
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `csv_path` | `str \| Path` | Path to the CSV file | *required* |
| `evaluator_classes` | `dict[str, type[BaseEvaluator]]` | Dictionary mapping `test_type` names to evaluator classes; extend `default_evaluator_classes` to register custom evaluators | `default_evaluator_classes` |
| `skip_unknown_evaluators` | `bool` | If True, skip rows with unknown evaluator types instead of raising an error | `False` |
| `question_column` | `str` | Name of the column containing questions | `'Question'` |
| `test_type_column` | `str` | Name of the column containing evaluator class names | `'test_type'` |
| `expected_column` | `str` | Name of the column for the expected flag | `'expected'` |
| `tags_column` | `str` | Name of the column for comma-separated tags | `'tags'` |
| `check_column` | `str` | Name of the column for evaluator-specific check data | `'check'` |
Returns:

| Type | Description |
|---|---|
| `Dataset[str, str, TestCaseMetadata]` | Dataset with Cases grouped by question, each Case having multiple evaluators |
### Example
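A minimal usage sketch. The file handling is plain stdlib; the `load_testset` call itself assumes `ragpill` is installed and is shown commented out:

```python
import tempfile
from pathlib import Path

# Write a minimal test set: one global evaluator (empty Question) and
# one question-specific evaluator.
csv_text = (
    "Question,test_type,expected,tags,check\n"
    ',LLMJudge,true,global,"response is polite"\n'
    'What is X?,RegexEvaluator,true,factual,"X.*definition"\n'
)
csv_path = Path(tempfile.mkdtemp()) / "testset.csv"
csv_path.write_text(csv_text)

# With ragpill available, the file loads into a Dataset whose Cases are
# grouped by question:
# from ragpill.csv.testset import load_testset
# dataset = load_testset(csv_path)
```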
### See Also

- `ragpill.base.BaseEvaluator.from_csv_line`: detailed descriptions of the standard parameters
- `ragpill.csv.testset.default_evaluator_classes`: dict of built-in evaluators
Source code in `src/ragpill/csv/testset.py`
## default_evaluator_classes

`ragpill.csv.testset.default_evaluator_classes` *module-attribute*

```python
default_evaluator_classes = {
    'LLMJudge': LLMJudge,
    'WrappedPydanticEvaluator': WrappedPydanticEvaluator,
    'RegexInSourcesEvaluator': RegexInSourcesEvaluator,
    'RegexInDocumentMetadata': RegexInDocumentMetadataEvaluator,
    'LiteralQuoteEvaluator': LiteralQuoteEvaluator,
    'HasQuotesEvaluator': HasQuotesEvaluator,
    'RegexInOutputEvaluator': RegexInOutputEvaluator,
}
```
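To register a custom evaluator, extend the mapping with a plain dict merge before passing it to `load_testset`. A sketch with `MyEvaluator` as a hypothetical stand-in (a real one would subclass `ragpill.base.BaseEvaluator`); `default_evaluator_classes` is reduced to a one-entry stand-in here so the snippet runs without `ragpill`:

```python
class MyEvaluator:
    """Hypothetical stand-in for a BaseEvaluator subclass."""

    @classmethod
    def from_csv_line(cls, expected=None, tags=None, check=None, **kwargs):
        # A real implementation would build an evaluator instance from the
        # standard CSV columns; any extra columns arrive via **kwargs and
        # land in the evaluator's attributes.
        instance = cls()
        instance.attributes = dict(kwargs)
        return instance

# Stand-in for ragpill.csv.testset.default_evaluator_classes:
default_evaluator_classes = {"LLMJudge": object}

# Merge rather than mutate, so the shared defaults stay untouched:
evaluator_classes = {**default_evaluator_classes, "MyEvaluator": MyEvaluator}

# load_testset("testset.csv", evaluator_classes=evaluator_classes)
print(sorted(evaluator_classes))  # ['LLMJudge', 'MyEvaluator']
```

Rows whose `test_type` column reads `MyEvaluator` would then be routed to `MyEvaluator.from_csv_line()`.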