(2017) Gazetteers for Information Extraction Applications in Construction Safety Management
![](https://share.be.uw.edu/wp-content/uploads/sites/54/2020/09/image025-1024x725.png)
Develop a special-purpose gazetteer for construction safety management by extracting specific information from safety documents written in natural language.
Gazetteers, also known as entity dictionaries, can support many information extraction (IE) applications such as named entity recognition (NER). However, gazetteers are not always available, because developing them requires both domain knowledge and human effort. Existing gazetteers are also mostly general in nature; they are limited to common types of entities such as locations, organizations, and person names. These common entity types cannot accurately reflect the semantics of specific domain knowledge, so their applicability is limited. A useful gazetteer, on the other hand, must indicate the important types of entities (i.e., classes of concepts) and must contain a sufficient number of entities. This creates the need for domain-specific gazetteers, especially for specialized IE tasks. In this paper, the authors take construction safety management as the target domain and propose a semi-automated approach to develop a construction safety gazetteer, aiming to eventually support IE applications for construction safety management. The proposed approach consists of three steps: (1) applying natural language processing (NLP) techniques to extract important phrases (not limited to single terms, but including bigrams and trigrams) from text resources, (2) defining important types of entities as entity classes for the construction safety domain, and (3) assigning the extracted phrases to the predefined entity classes. The authors also discuss how the proposed methodology could be made affordable for domain experts, and outline possible scenarios in which IE applications support construction safety management.
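The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the entity classes (`Equipment`, `Hazard`, `ProtectiveMeasure`), their cue words, the frequency threshold, and the head-word matching rule are all hypothetical choices made for illustration, and the sample sentences are invented. The summary above does not specify how phrases are scored or assigned.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "or", "in", "on",
              "at", "is", "are", "be", "must", "when", "with", "for", "by"}

# Step 2 (assumed): entity classes and cue words that a domain expert
# might define. These names and words are illustrative only.
ENTITY_CLASSES = {
    "Equipment": {"harness", "scaffold", "ladder"},
    "Hazard": {"fall", "collapse"},
    "ProtectiveMeasure": {"protection", "guardrail"},
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def extract_phrases(sentences, max_n=3, min_count=2):
    """Step 1: count unigrams, bigrams, and trigrams, keeping frequent ones."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Skip candidates that start or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                counts[" ".join(gram)] += 1
    return {p: c for p, c in counts.items() if c >= min_count}


def assign_classes(phrases):
    """Step 3: assign each phrase to a class by its last (head) word."""
    gazetteer = {name: [] for name in ENTITY_CLASSES}
    for phrase in sorted(phrases):
        head = phrase.split()[-1]
        for name, cues in ENTITY_CLASSES.items():
            if head in cues:
                gazetteer[name].append(phrase)
    return gazetteer


# Invented sample sentences standing in for safety documents.
sentences = [
    "Workers must wear a safety harness when working on a scaffold.",
    "Inspect the safety harness before climbing the scaffold.",
    "Fall protection is required near the scaffold edge.",
    "Fall protection includes guardrails and nets.",
]

phrases = extract_phrases(sentences)
gazetteer = assign_classes(phrases)
for entity_class, entries in gazetteer.items():
    print(entity_class, entries)
```

Phrases that match no class (here, `safety` on its own) are simply left out of the gazetteer. In a semi-automated workflow, a domain expert could review these leftovers rather than discard them.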
Related Publications:
Chi, N.W., Lin, K.Y., El-Gohary, N., and Hsieh, S.H. (2016) “Evaluating the Strength of Text Classification Categories on Supporting Construction Field Inspections”, Automation in Construction, Vol. 64, pp. 78–88 (first published online on Jan. 28, 2016).
Chi, N.W., Lin, K.Y., El-Gohary, N., and Hsieh, S.H. (2017) “Gazetteers for Information Extraction Applications in Construction Safety Management”, Proceedings of the International Workshop on Computing in Civil Engineering (IWCCE) 2017, June 25–27, 2017, Seattle.