AN ANALYST-ADAPTIVE APPROACH TO FOCUSED CRAWLERS
ZUNINO, RODOLFO;
2013-01-01
Abstract
The paper presents a general methodology for implementing a flexible Focused Crawler for investigation, monitoring, and Open Source Intelligence (OSINT). The resulting tool is specifically designed to meet the operational requirements of law-enforcement agencies and intelligence analysts. The architecture of the semantic Focused Crawler offers static flexibility in the definition of the desired concepts, the metrics used, and the crawling strategy; in addition, the method can learn, and adapt to, the analyst's expectations at runtime. The user can give the crawler binary feedback (yes/no) on the current performance of the surfing process, and the crawling engine progressively refines the expected targets accordingly. The implementation is based on an existing text-mining environment, integrated with semantic networks and ontologies. Experimental results confirm the effectiveness of the adaptive mechanism.
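As an illustration only, the binary-feedback idea in the abstract can be sketched as a scorer whose concept weights move toward or away from a page's concepts after each yes/no answer. This is not the paper's method: the additive update rule, the class name AdaptiveScorer, the learning_rate value, and the example concepts and activations are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveScorer:
    """Toy relevance scorer; concept weights adapt to yes/no feedback.

    Hypothetical stand-in for an adaptive crawler component, not the
    mechanism described in the paper.
    """
    weights: dict[str, float]
    learning_rate: float = 0.2  # assumed step size

    def score(self, concepts: dict[str, float]) -> float:
        """Weighted sum of the concept activations found in a page."""
        return sum(self.weights.get(c, 0.0) * v for c, v in concepts.items())

    def feedback(self, concepts: dict[str, float], approve: bool) -> None:
        """Shift weights toward (yes) or away from (no) the page's concepts."""
        sign = 1.0 if approve else -1.0
        for c, v in concepts.items():
            self.weights[c] = self.weights.get(c, 0.0) + sign * self.learning_rate * v


# Made-up concept activations for two pages.
scorer = AdaptiveScorer(weights={"weapons": 1.0, "trafficking": 1.0})
page_a = {"weapons": 0.9, "hunting": 0.8}      # off-target for the analyst
page_b = {"weapons": 0.7, "trafficking": 0.6}  # on-target

before = scorer.score(page_a)
scorer.feedback(page_a, approve=False)  # analyst answers "no"
after = scorer.score(page_a)

print(f"page_a score: {before:.2f} -> {after:.2f}")
print(f"page_b score: {scorer.score(page_b):.2f}")
```

In this sketch a single "no" lowers the score of the rejected page and gives the unwanted concept ("hunting") a negative weight, so later pages that mention it rank lower in the crawl frontier.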