Hazelcast-based inverted indexing.

Summary

  • The project is built with Java, Hazelcast, and JSch.

  • This assignment involves two versions of an inverted indexing program built on Hazelcast's distributed map. The first version counts the occurrences of a given word in each file and displays the file names and counts. The second version uses remote execution to count the occurrences in the files local to each cluster node. The goal is to explore Hazelcast's remote execution mechanism and measure its performance.

  • This project implements two versions of an inverted indexing program using Hazelcast’s distributed map, which maintains a database of items, each with key = a file name and value = its text data. One version, InvertedIndexingLocal.java, retrieves each file from the database, counts the occurrences of a given word, and prints out the file name and the number of occurrences. The other version, InvertedIndexingRemote.java, dispatches InvertedIndexingEach.class to each remote cluster node, where it counts the occurrences of a given word only in the files local to that node. The purpose of this assignment is to understand Hazelcast’s mechanism of remote execution and to measure its execution performance.

  • Limitations: (1) Topology hardcoding: the current implementation hardcodes the cluster topology by IP address, which limits flexibility and scalability. (2) Single point of failure: the implementation relies on a single Hazelcast instance as the central coordination point, so the system can fail if that instance becomes unavailable.

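The difference between the two versions described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the project's code: a java.util.Map stands in for Hazelcast's distributed map (key = file name, value = text), a local thread pool stands in for the cluster's remote execution, and the class and method names (InvertedIndexSketch, countOccurrences, indexLocally, indexPerNode) are hypothetical. Matching is assumed to be on whitespace-delimited, case-sensitive tokens, which the summary does not specify. No Hazelcast API is called.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; plain collections stand in for the distributed map.
final class InvertedIndexSketch {

    // Counts whitespace-delimited tokens equal to `word` (case-sensitive: an assumption).
    static int countOccurrences(String text, String word) {
        int count = 0;
        for (String token : text.split("\\s+")) {
            if (token.equals(word)) {
                count++;
            }
        }
        return count;
    }

    // "Local" style: the caller pulls every file out of the map and counts itself.
    static Map<String, Integer> indexLocally(Map<String, String> files, String word) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : files.entrySet()) {
            counts.put(entry.getKey(), countOccurrences(entry.getValue(), word));
        }
        return counts;
    }

    // "Remote" style: one task per node counts only that node's files,
    // and the caller merges the per-node results.
    static Map<String, Integer> indexPerNode(List<Map<String, String>> nodes, String word) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (Map<String, String> nodeFiles : nodes) {
                Callable<Map<String, Integer>> task = () -> indexLocally(nodeFiles, word);
                futures.add(pool.submit(task));
            }
            Map<String, Integer> merged = new LinkedHashMap<>();
            for (Future<Map<String, Integer>> future : futures) {
                merged.putAll(future.get());
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("per-node counting failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Made-up sample data: two "nodes", each holding its own files.
        Map<String, String> nodeA = Map.of("a.txt", "to be or not to be");
        Map<String, String> nodeB = Map.of("b.txt", "let it be");
        indexPerNode(List.of(nodeA, nodeB), "be")
                .forEach((file, n) -> System.out.println(file + " " + n));
    }
}
```

In the local style, every file's text travels to the caller before it is counted. In the per-node style, only the small per-file counts travel back. That is the trade-off the performance measurement is meant to expose.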