Characterizing unstructured data with the nearest neighbor permutation entropy


Permutation entropy and its associated frameworks are physics-inspired techniques well suited to processing large and complex datasets. Despite substantial progress in developing and applying these tools, their use has been largely limited to structured datasets such as time series or images. Here, we introduce the k-nearest neighbor permutation entropy, an extension of permutation entropy designed for unstructured data, irrespective of their spatial or temporal configuration and dimensionality. Our approach builds upon nearest neighbor graphs to establish neighborhood relations and uses random walks to extract ordinal patterns and their distribution, thereby defining the k-nearest neighbor permutation entropy. This tool identifies variations in patterns of unstructured data with a precision that significantly surpasses conventional measures such as spatial autocorrelation. Additionally, it provides a natural approach for incorporating amplitude information and time gaps when analyzing time series or images, thus significantly enhancing noise resilience and predictive capabilities compared to the usual permutation entropy. Our research substantially expands the applicability of ordinal methods to more general data types, opening promising research avenues for extending the permutation entropy toolkit to unstructured data.
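
The pipeline outlined above (nearest neighbor graph, random walks, ordinal patterns, entropy of their distribution) can be illustrated with a minimal sketch. It is not the authors' implementation, and every detail the abstract leaves unspecified is an assumption made here for illustration: the brute-force symmetrized neighbor graph, the unbiased random walk, the parameter names (`k`, `d`, `walk_length`, `walks_per_node`), the tie-breaking rule, and the normalization by log(d!).

```python
import math
import random
from collections import Counter


def knn_graph(points, k):
    """Brute-force k-nearest-neighbor graph as adjacency lists.

    Assumption: edges are symmetrized so every node can be left again.
    """
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return [sorted(s) for s in neighbors]


def ordinal_pattern(window):
    """Indices that sort the window; ties are broken by position (assumption)."""
    return tuple(sorted(range(len(window)), key=lambda t: (window[t], t)))


def knn_permutation_entropy(points, values, k=3, d=3,
                            walk_length=10, walks_per_node=10, seed=0):
    """Normalized Shannon entropy of ordinal patterns sampled along random walks.

    points: coordinates of the data points (any dimension)
    values: one scalar per data point
    The walk is a simple unbiased walk on the neighbor graph (assumption).
    """
    rng = random.Random(seed)
    graph = knn_graph(points, k)
    counts = Counter()
    for start in range(len(points)):
        for _ in range(walks_per_node):
            node = start
            series = [values[node]]
            for _ in range(walk_length - 1):
                node = rng.choice(graph[node])
                series.append(values[node])
            # Slide a window of length d along the sampled series.
            for t in range(len(series) - d + 1):
                counts[ordinal_pattern(series[t:t + d])] += 1
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = sum(-p * math.log(p) for p in probs)
    return entropy / math.log(math.factorial(d))
```

Calling `knn_permutation_entropy(points, values)` with a list of coordinate tuples and a matching list of scalars returns a number between 0 and 1. A constant field yields 0, since only one ordinal pattern ever occurs. Because an unbiased walk can revisit nodes, ties appear in the sampled series, so the tie-breaking rule affects the result.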