dc.contributor.author | Lukšová, Ivana |
dc.contributor.author | Hladká, Barbora |
dc.date.accessioned | 2015-10-13T17:18:52Z |
dc.date.available | 2015-10-13T17:18:52Z |
dc.date.issued | 2015-10-13 |
dc.identifier.uri | http://hdl.handle.net/11234/1-1515 |
dc.description | Environmental impact assessment (EIA) is the formal process used to predict the environmental consequences of a plan. We present a rule-based extraction system for mining Czech EIA documents. The extraction rules operate on documents enriched with morphological information and on manually created vocabularies of the terms to be extracted, e.g. basic information about the project (address, company ID, ...), data on impacts and outcomes (waste substances, endangered species, ...), and the final opinion. The Notice of Intent documents contain section BI2, which holds information on the scope (capacity) of the plan. |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.rights | GNU General Public Licence, version 3 |
dc.rights.uri | http://opensource.org/licenses/GPL-3.0 |
dc.source.uri | http://ufal.mff.cuni.cz/grants/intelligent-library |
dc.subject | information extraction |
dc.subject | rule-based extraction |
dc.title | Information extraction from EIA documents |
dc.type | toolService |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
metashare.ResourceInfo#ContentInfo.detailedType | tool |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Ivana Lukšová luksova@ufal.mff.cuni.cz Charles University in Prague, UFAL |
sponsor | Technologická agentura České republiky TA02010182 Inteligentní knihovna - INTLIB nationalFunds |
files.size | 1801300 |
files.count | 1 |
Files in this item
- Name
- intlib_eia_app.zip
- Size
- 1.72 MB
- Format
- application/zip
- Description
- Extraction system sources and rules
- MD5
- 3bdefa5bb3cfa815886bba571095dcd6
- intlib_eia_app
- stanoviskoNatura.cmd (224 B)
- src
- EIA_analysis
- pom.xml (2 kB)
- .project (772 B)
- .settings
- org.eclipse.jdt.core.prefs (243 B)
- org.eclipse.core.resources.prefs (191 B)
- .classpath (1 kB)
- src
- main
- java
- czsem
- gate
- cz
- intlib
- eia
- util
- EIAUtil.java (703 B)
- processing
- LinkEIAAnnotationPR.java (5 kB)
- EIANazevExtraction.java (3 kB)
- EIADictionariesFilter.java (11 kB)
- EIAHeadingsFeatureExtractor.java (2 kB)
- EIAHeadersFilter.java (4 kB)
- EIASectionBuilder.java (3 kB)
- EIAAnalysisConfig.java (1 kB)
- io
- EIAXMLExporter.java (34 kB)
- DataStoreImporter.java (1 kB)
- analysis
- SectionsDetection.java (6 kB)
- TreexAnalysis.java (1 kB)
- EntityDetection.java (6 kB)
- endtoend
- EIAMainClass.java (937 B)
- OznameniAnalysis.java (5 kB)
- StanoviskoNaturaAnalysis.java (2 kB)
- StanoviskoAnalysis.java (2 kB)
- util
- eia
- intlib
- resources
- eia_request_v1.3_template.xml (1 kB)
- eia_statement_v1.0_template.xml (624 B)
- java
- test
- main
- EIA_analysis
- resources
- eia
- feature_aggregator
- subsections_aggr.txt (150 B)
- main_sections_aggr.txt (449 B)
- gazetteer
- Oznameni_dictionary_cs.def (404 B)
- Cislovky_slovne2.lst (2 kB)
- Prirodni_oblasti.lst (517 kB)
- Odpady_kody.lst (85 kB)
- Kategorizace.lst (34 B)
- Rozhodnuti.lst (21 kB)
- Stavy.lst (18 kB)
- Odpady.lst (807 kB)
- typy_posNATURA.def (57 B)
- odpad_nazvy.lst (167 kB)
- typy_stanoviska.lst (3 kB)
- Oznameni_headers.def (47 B)
- BI2_entities_cs.def (175 B)
- Skodlive_latky_cs.lst (449 B)
- Terminy.lst (14 kB)
- metrika_before.lst (602 B)
- Casy.lst (3 kB)
- Oznameni_headers_cs.def (50 B)
- Kraje_CR.lst (12 kB)
- Kraj_obec.lst (1 MB)
- Skodlive_latky_zjednodusene.lst (119 kB)
- typy_posNATURA.lst (3 kB)
- Metrika.lst (51 kB)
- entities.def (49 B)
- Sousedni_zeme.lst (4 kB)
- Kategorizace_zakonna_omezeni_varianty_kat.lst (6 kB)
- Oznameni_headers.lst (9 kB)
- Oznameni_dictionary.def (385 B)
- BI2_entities.def (292 B)
- Veliciny2.lst (32 kB)
- Metrika2.lst (278 B)
- Oznameni_headers_cs.lst (371 B)
- Odpady_typy2.lst (64 B)
- Pojmy.lst (689 kB)
- Ohr+Chr_druhy.lst (534 kB)
- Pojmy2.lst (114 kB)
- typy_stanoviska.def (52 B)
- entities.lst (78 kB)
- Veliciny.lst (83 kB)
- Ohrozene_druhy.lst (484 kB)
- Chranene_druhy_synonyma.lst (108 kB)
- Prirodni_oblasti_typy_cs.lst (325 B)
- Skodlive_latky.lst (95 kB)
- Prirodni_oblasti_typy.lst (808 B)
- Pojmy_cs.lst (153 B)
- Odpady_typy.lst (160 B)
- regexp
- sentence.txt (95 B)
- common_regexp.txt (145 B)
- oznameni_regexp.txt (252 B)
- number_regexp.txt (1 kB)
- linking
- Velicina_metrika.csv (37 kB)
- 140717
- Pojem_velicina.csv (84 kB)
- Pojem_velicina.csv (141 kB)
- jape
- attribute_entity.jape (945 B)
- attribute_entity2.jape (955 B)
- stav1.jape (2 kB)
- number_unit.jape (5 kB)
- attribute_number.jape (1 kB)
- dict_linking.jape (6 kB)
- entity_count.jape (2 kB)
- stav2.jape (1 kB)
- readme.txt (441 B)
- add_id.jape (743 B)
- add_id2.jape (1 kB)
- entity_unit.jape (2 kB)
- termin_attribute.jape (655 B)
- feature_aggregator
- eia
- configuration
- eia_config.xml (480 B)
- czsem_config.xml (1 kB)
- output
- stanoviskoNatura.sh (304 B)
- oznameni.cmd (260 B)
- generator
- EIAdokumentace.pdf (743 kB)
- lib
- readme.txt (68 B)
- stanovisko.sh (286 B)
- input
- EIAdokumentace.odt (327 kB)
- oznameni.sh (433 B)
- datastores
- stanovisko.cmd (205 B)