PRISCILLA RAMOS CARVALHO

Search Results

Now showing 1 - 3 of 3
  • Abstract IPEN-doc 24572
    Comparative study of hierarchical clustering
    2017 - CARVALHO, P.R.; MUNITA, C.S.; LAPOLLI, A.L.
    In archaeological studies, several analytical techniques are used to determine the chemical and mineralogical composition of materials of archaeological origin, generating large data sets. Multivariate statistical methods are therefore indispensable for interpreting the results. These techniques, both unsupervised and supervised, are supported by modern computational programs that aid visualization and interpretation. Several methods have been used, such as cluster analysis, discriminant analysis, and principal component analysis, among others; of these, cluster analysis is the most widely used. The purpose of cluster analysis is to group samples based on similarity or dissimilarity, so that the groups are homogeneous internally and heterogeneous with respect to one another. The literature presents many methods for partitioning a data set, and it is difficult to choose the most suitable one, since different combinations of methods and dissimilarity measures can lead to different grouping patterns and to false interpretations. Nevertheless, little effort has been made to evaluate these methods empirically using an archaeological data set. The objective of this work is therefore to compare the different cluster analysis methods and to identify the most appropriate one. The study used a data set from the Archaeometric Studies Group at IPEN-CNEN/SP, in which 45 samples of ceramic fragments from three archaeological sites were analyzed by instrumental neutron activation analysis (INAA) to determine the mass fractions of 13 elements (As, Ce, Cr, Eu, Fe, Hf, La, Na, Nd, Sc, Sm, Th, U). The methods compared were single linkage, complete linkage, average linkage, centroid, and Ward. The comparison was made using the cophenetic correlation coefficient; according to these values, the average linkage method gave the best results.
    A script for the statistical program R was written to obtain the cophenetic correlation coefficient. Its purpose is to facilitate statistical analysis for researchers who are not very familiar with statistical programs, so that they can easily check which method is most appropriate for their data set.
  • Article IPEN-doc 24038
    Validity studies among hierarchical methods of cluster analysis using cophenetic correlation coefficient
    2017 - CARVALHO, PRISCILLA R.; MUNITA, CASIMIRO S.; LAPOLLI, ANDRE L.
    The literature presents many methods for partitioning a database, and it is difficult to choose the most suitable one, since different combinations of methods and dissimilarity measures can lead to different grouping patterns and to false interpretations. Nevertheless, little effort has been made to evaluate these methods empirically using an archaeological database. The objective of this work is therefore to compare the different cluster analysis methods and to identify the most appropriate one. The study used a database from the Archaeometric Studies Group at IPEN-CNEN/SP, in which 45 samples of ceramic fragments from three archaeological sites were analyzed by instrumental neutron activation analysis (INAA) to determine the mass fractions of 13 elements (As, Ce, Cr, Eu, Fe, Hf, La, Na, Nd, Sc, Sm, Th, U). The methods compared were single linkage, complete linkage, average linkage, centroid, and Ward. The validation was done using the cophenetic correlation coefficient; comparing these values, the average linkage method gave the best results. A script for the statistical program R, with some functions, was written to obtain the cophenetic correlation. These values made it possible to choose the most appropriate method for the database.
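    The comparison described in the two abstracts above can be illustrated with a minimal sketch. It is not the authors' R script, which is not reproduced in this listing. It is a pure-Python approximation, assuming Euclidean distance and the Lance-Williams update formulas for the five linkage methods. The data points and all function names (`cophenetic_distances`, `cophenetic_correlation`, and so on) are hypothetical and chosen only for illustration; they are not the IPEN INAA measurements.

```python
import math
from itertools import combinations

METHODS = ("single", "complete", "average", "centroid", "ward")

def pairwise_euclidean(rows):
    """Return {(i, j): Euclidean distance} for all sample pairs i < j."""
    return {(i, j): math.dist(rows[i], rows[j])
            for i, j in combinations(range(len(rows)), 2)}

def lance_williams(method, ni, nj, nk):
    """Lance-Williams coefficients (alpha_i, alpha_j, beta, gamma)."""
    if method == "single":
        return 0.5, 0.5, 0.0, -0.5
    if method == "complete":
        return 0.5, 0.5, 0.0, 0.5
    if method == "average":
        return ni / (ni + nj), nj / (ni + nj), 0.0, 0.0
    if method == "centroid":
        n = ni + nj
        return ni / n, nj / n, -ni * nj / n ** 2, 0.0
    if method == "ward":
        n = ni + nj + nk
        return (ni + nk) / n, (nj + nk) / n, -nk / n, 0.0
    raise ValueError(f"unknown method: {method}")

def cophenetic_distances(rows, method):
    """Cluster agglomeratively; return {(i, j): height at which i and j merge}."""
    # Centroid and Ward updates are defined on squared Euclidean distances.
    squared = method in ("centroid", "ward")
    d = {p: (v * v if squared else v)
         for p, v in pairwise_euclidean(rows).items()}
    members = {i: [i] for i in range(len(rows))}
    coph = {}
    next_id = len(rows)
    while len(members) > 1:
        # Pick the closest pair of current clusters (keys are always (small, large)).
        a, b = min(combinations(sorted(members), 2), key=lambda p: d[p])
        raw = d[(a, b)]
        height = math.sqrt(max(raw, 0.0)) if squared else raw
        for x in members[a]:
            for y in members[b]:
                coph[(min(x, y), max(x, y))] = height
        na, nb = len(members[a]), len(members[b])
        for k in members:
            if k in (a, b):
                continue
            dka = d[(min(k, a), max(k, a))]
            dkb = d[(min(k, b), max(k, b))]
            ai, aj, beta, gamma = lance_williams(method, na, nb, len(members[k]))
            d[(k, next_id)] = ai * dka + aj * dkb + beta * raw + gamma * abs(dka - dkb)
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return coph

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) *
                           sum((y - my) ** 2 for y in ys))

def cophenetic_correlation(rows, method):
    """Correlate original distances with the dendrogram's cophenetic distances."""
    original = pairwise_euclidean(rows)
    coph = cophenetic_distances(rows, method)
    pairs = sorted(original)
    return pearson([original[p] for p in pairs], [coph[p] for p in pairs])

# Made-up 2-D samples (illustrative only).
samples = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (5.0, 5.1), (5.2, 4.8), (9.0, 1.0)]
scores = {m: cophenetic_correlation(samples, m) for m in METHODS}
for m in sorted(scores, key=scores.get, reverse=True):
    print(f"{m:9s} {scores[m]:.4f}")
```

    A coefficient closer to 1 means the dendrogram distorts the original pairwise distances less, which is the criterion the abstracts use to rank the methods. In practice the element concentrations would usually be transformed or standardized before clustering; that step is omitted here and would depend on the data.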
  • Article IPEN-doc 23836