Program for Data Augmentation

Overview

This is a set of programs for data augmentation using the Japanese Civil Code, intended for legal textual entailment (COLIEE Task 4).
Programs: DataAugmentation.zip

Description

This is a tool for automatic data augmentation for legal textual entailment systems. It augments data by considering logical mismatches in juridical decisions. From a text file of the Japanese Civil Code, it generates a TSV file (label, question, relevant article) as augmentation data.
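The generated TSV can be loaded with Python's standard csv module. The following is a minimal sketch, not part of the distributed programs. It assumes three tab-separated columns in the order label, question, relevant article, with no header row. The function name read_augmentation_rows and the sample rows (including the Y/N label values) are hypothetical placeholders.

```python
import csv
import io


def read_augmentation_rows(handle):
    """Yield (label, question, article) tuples from a TSV stream.

    Assumes three tab-separated columns and no header row.
    """
    # QUOTE_NONE keeps quotation marks inside the legal text untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        label, question, article = row
        yield label, question, article


# Hypothetical two-row sample standing in for augmentation.tsv.
sample = "Y\tquestion one\tarticle one\nN\tquestion two\tarticle two\n"
rows = list(read_augmentation_rows(io.StringIO(sample)))
```

To read a real file, open it with open(path, encoding="utf-8", newline="") and pass the file object to the function. UTF-8 is an assumption about the output encoding.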

Requirements

Usage

Programs for data augmentation using the Japanese Civil Code
  1. Run the data augmentation program (augmentation.py).
    $ python augmentation.py --input input_file_name.txt --output output_directory/
    Arguments:
    --input/-i : input file path
    --output/-o: output directory
  2. The augmentation data (augmentation.tsv) will be written to the directory you specified.
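The command-line interface described in the steps above can be sketched with argparse. This is an illustration only and is not taken from augmentation.py. The helper names build_parser and output_path are hypothetical, and marking both options as required is an assumption.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser for the two documented options."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented options (not the real augmentation.py)."
    )
    # Assumption: both options are mandatory.
    parser.add_argument("--input", "-i", required=True, help="input file path")
    parser.add_argument("--output", "-o", required=True, help="output directory")
    return parser


def output_path(output_dir):
    """Return the path where augmentation.tsv would be written."""
    return Path(output_dir) / "augmentation.tsv"


# Parse the same arguments as in the usage line, using the short forms.
args = build_parser().parse_args(["-i", "input_file_name.txt", "-o", "output_directory/"])
tsv_path = output_path(args.output)
```

Here tsv_path points to augmentation.tsv inside the given output directory, matching step 2.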
The sample input data for data augmentation was created from Civil Code article texts (in Japanese) on e-Gov, provided by the Japanese Ministry of Justice.