Experimental comparison of clustering algorithms in the problem of lightning data grouping
Belikova Marina Yur'evna

Senior Lecturer of the Department of Physics and Informatics, Gorno-Altai State University

649000, Russia, respublika Altai, g. Gorno-Altaisk, ul. Lenkina, 1

BelikovaMY@yandex.ru

 

 
Karanina Svetlana Yur'evna

PhD in Physics and Mathematics

Associate Professor, Department of Physics and Informatics, Gorno-Altai State University

649000, Russia, respublika Altai, g. Gorno-Altaisk, ul. Lenkina, 1

krechetovas@yandex.ru

 

 
Glebova Alena Viktorovna

Senior Lecturer, Department of Physics and Informatics, Gorno-Altai State University

649000, Russia, respublika Altai, g. Gorno-Altaisk, ul. Lenkina, 1

glebova-alena-1991@yandex.ru

 

 

Abstract.

The authors present an experimental comparison of cluster analysis of thunderstorm data using the k-means and dbscan algorithms and hierarchical agglomerative algorithms in which the intercluster distance is calculated by the nearest-neighbor (single linkage), complete linkage, and average linkage methods and by Ward's method. The influence of the normalization parameters on the number of clusters that each algorithm identifies in the test sample is estimated. The test data are the registration times and coordinates of lightning discharges recorded by the World Wide Lightning Location Network (WWLLN). The clustering solutions were constructed with the NbClust, dbscan, and fpc cluster analysis packages for the R language. The article shows that the choice of normalization parameter values has a significant effect on the number of clusters that the hierarchical algorithms extract from the sample, especially for the nearest-neighbor method. For the k-means and dbscan algorithms, the choice of normalization parameters has little or no effect on the clustering of lightning discharges. The best agreement with expert judgment was obtained for the dbscan algorithm with normalization parameters corresponding to a thunderstorm convective cell with a linear size of 100 km and a time period of 30 minutes to one hour.
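The normalization idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation, which used R packages. It assumes discharge coordinates have already been projected to kilometres and times are given in minutes, and it divides them by a spatial scale of 100 km and a temporal scale of 30 minutes (the values named in the abstract) before running a plain density-based clustering. The function names, the sample events, and the values eps=1.0 and min_pts=3 are hypothetical choices made for illustration only.

```python
import math

def normalize(events, space_km=100.0, time_min=30.0):
    """Scale (x_km, y_km, t_min) so that one unit equals the assumed
    convective-cell size in space and the assumed period in time."""
    return [(x / space_km, y / space_km, t / time_min) for x, y, t in events]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch: returns one label per point, -1 marks noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query (the point itself is included).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:     # core point: keep expanding
                queue.extend(nb)
    return labels

# Hypothetical discharges: (x in km, y in km, time in minutes).
events = [
    (0, 0, 0), (10, 5, 5), (20, 10, 10), (15, 0, 15),        # first cell
    (400, 0, 0), (410, 10, 5), (405, 5, 12), (395, -5, 20),  # second cell
    (1000, 1000, 300),                                       # isolated discharge
]
labels = dbscan(normalize(events), eps=1.0, min_pts=3)
print(labels)
```

With these scales, a neighborhood radius of 1.0 in normalized units corresponds to roughly one cell size in space or one period in time, so changing space_km or time_min changes which discharges count as neighbors.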

Keywords: cluster validity, hierarchical algorithms, dbscan, k-means, clustering algorithms, data mining, average silhouette width (asw), data normalization, lightning, WWLLN

DOI: 10.25136/2306-4196.2018.1.25261

Article was received: 23-01-2018

Review date: 24-01-2018

Publish date: 26-01-2018


This article is written in Russian. The full text is available in Russian.
