SEMINAR: Big Data Analytics: Applications and Challenges
Prof. Sílvia Pedro Rebouças - Faculdade Economia, Administração, Atuária e Contabilidade - Universidade Federal do Ceará, Brazil
- FCUL - Campo Grande - Building C6, Floor 4 - Room 6.4.30 - 14:00
- Friday, 23 June 2017
- Project reference: FCT project UID/MAT/00006/2013
Advances in information technologies have led organizations to store large amounts of data. Analyzing these data with data mining techniques, also called big data analytics, provides important support for decision-making. This seminar presents two applications: one for structured data and another for unstructured data in text format. The first aims to classify the beneficiaries of a health insurance operator in Brazil according to their financial sustainability, based on their sociodemographic characteristics and their healthcare cost history. Beneficiaries with a loss ratio greater than 0.75 were considered unsustainable. The sample consisted of 38,875 beneficiaries who were active between 2011 and 2013. The techniques used were logistic regression, which presented the best performance (with an accuracy rate of 68.43%), and classification trees. Age and type of plan were the most important beneficiary-profile variables in the classification. Among healthcare costs, annual spending on consultations and on dental insurance stood out. In the second application, the goal is to develop a tool to evaluate organizations' stakeholder engagement, as disclosed in sustainability reports, using a text mining approach. To achieve this goal, reports that Brazilian companies published during 2016 under the G4 guidelines of the Global Reporting Initiative were used. The results showed that the most frequently mentioned stakeholders were employees, clients, market and suppliers, and that the major concern they raised relates mainly to the environment. Two clusters with different patterns of stakeholder engagement were found, but firm characteristics did not differ significantly between them. The proposed method facilitates pattern recognition in texts, eliminating the need for time-consuming techniques, such as content analysis, that are usually used to analyze reports.
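The labeling rule used in the first application (a loss ratio above 0.75 marks a beneficiary as unsustainable) can be illustrated with a minimal sketch. The abstract does not define the loss ratio, so the code assumes the usual insurance definition of healthcare costs divided by premiums paid. The function names and the three sample records are hypothetical and are not taken from the study's data.

```python
# Cutoff stated in the abstract: loss ratio > 0.75 means "unsustainable".
THRESHOLD = 0.75


def loss_ratio(costs, premiums):
    """Assumed definition: healthcare costs divided by premiums paid."""
    if premiums <= 0:
        raise ValueError("premiums must be positive")
    return costs / premiums


def is_unsustainable(costs, premiums, threshold=THRESHOLD):
    """Return True when the loss ratio is strictly above the threshold."""
    return loss_ratio(costs, premiums) > threshold


# Hypothetical records: (beneficiary_id, healthcare_costs, premiums_paid).
beneficiaries = [
    ("A", 900.0, 1000.0),
    ("B", 300.0, 1000.0),
    ("C", 750.0, 1000.0),
]

# Binary target that a classifier such as logistic regression could be fit on.
labels = {bid: is_unsustainable(costs, premiums)
          for bid, costs, premiums in beneficiaries}
```

Because the comparison is strict, a beneficiary whose loss ratio is exactly 0.75 is labeled sustainable, which matches the wording "greater than 0.75".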