Fig 9 - uploaded by David Mohaisen
SLDs with .onion queries over time.

Source publication
Article
The Tor hidden services, one of the features of the Tor anonymity network, are widely used for providing anonymity to services within the Tor network. Tor uses the .onion pseudo-top-level domain for naming convention and to route requests to these hidden services. The .onion namespace is not delegated to the global domain name system (DNS), and Tor...
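
As a minimal sketch of how leaked names under the .onion pseudo-top-level domain could be grouped by second-level domain (SLD) when counting them in a DNS trace, the following assumes query names are available as plain strings. The function name `onion_sld` and the sample query names are hypothetical and illustrative only; they are not taken from the source publication or its dataset.

```python
def onion_sld(qname):
    """Return the label immediately left of 'onion', or None if qname is not under .onion."""
    # Normalize: drop the trailing root dot and ignore case, as DNS names are case-insensitive.
    labels = qname.rstrip(".").lower().split(".")
    if len(labels) < 2 or labels[-1] != "onion":
        return None
    return labels[-2]


# Hypothetical query names (assumption: not real hidden-service addresses).
queries = [
    "exampleonionaddr.onion.",
    "www.exampleonionaddr.onion",
    "EXAMPLEONIONADDR.ONION",
    "example.com.",
    "onion.",
]

# Distinct SLDs that received at least one .onion query.
slds = {s for s in map(onion_sld, queries) if s is not None}
```

Collecting the results in a set means that repeated queries, subdomain variants, and case variants of the same address count as one SLD.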

Context in source publication

Context 1
... of SLDs: The total number of SLDs that attracted .onion traffic and seen at the root for the observation period grows exponentially, as shown in Figure 9. This trend can be used to precisely extrapolate the number of SLDs to be observed at the root unless the root cause of leakage is addressed. ...
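
To illustrate what extrapolating an exponential trend involves, here is a minimal sketch that fits counts ≈ a·exp(b·day) by ordinary least squares on the logarithm of the counts. The series is synthetic and made up for the example; it is not the data behind Figure 9, and the helper names `fit_exponential` and `extrapolate` are hypothetical. Any projection of this kind holds only as long as the fitted trend continues.

```python
import math


def fit_exponential(days, counts):
    """Fit counts ~ a * exp(b * day) with a least-squares line through log(counts)."""
    n = len(days)
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx                      # growth rate per day
    a = math.exp(mean_y - b * mean_x)  # fitted value at day 0
    return a, b


def extrapolate(a, b, day):
    """Project the fitted curve to a given day."""
    return a * math.exp(b * day)


# Synthetic series (assumption: NOT measurements from the source publication).
days = [0, 30, 60, 90, 120]
counts = [100 * math.exp(0.01 * d) for d in days]

a, b = fit_exponential(days, counts)
projected = extrapolate(a, b, 150)
```

Because the fit is linear in log space, a straight line on a semi-log plot of the SLD count over time is the visual counterpart of the exponential growth described above.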

Citations

... This type of information leakage is discussed in detail in [21] and [39]. Mohaisen et al. studied the possibility of observing Tor requests on global DNS infrastructure that could threaten the private location of servers hosting Tor services and names/onion addresses of Tor domains [40]. Their characterization of the leakage indicated high volumes of leakage that were geographically distributed and targeted different types of hidden services. ...
Article
Content on the World Wide Web that is not indexable by standard search engines defines a category called the deep Web. Dark networks are a subset of the deep Web. They provide services of great interest to users who seek online anonymity during their search on the Internet. Tor is the most widely used dark network around the world. It requires unique application layer protocols and authorization schemes to access. The present evidence reveals that in spite of great efforts to investigate Tor, our understanding is limited to the work on either the information or structure of this network. Also, interplay between information and structure that plays an important role in evaluating socio-technical systems including Tor has not been given the attention it deserves. In this article, we review and classify the present work on Tor to improve our understanding of this network and shed light on the new directions to evaluate Tor. The related work can be categorized into proposals that (1) study the security and privacy on Tor, (2) characterize Tor’s structure, (3) evaluate the information hosted on Tor, and (4) review the related work on Tor from 2014 to the present.
... WDNs leak varies from 5% of European countries to 50% of developed countries, which experience a high leak level of 30% [14][15]. These alarming figures highlight the pressing need to take action and improve the efficiency of water distribution systems [16][17]. [18] Previously, researchers developed algorithms to address the problem at hand. However, these algorithms suffer from several limitations, including the need for time-consuming and resource-intensive manual inspections, inaccurate leak detection, and false alarms arising from randomly placed sensors in WDNs [19][20]. ...
... 8) in the CNN to introduce nonlinearity and prevent vanishing gradients. 16. For each data point in the input sequence, do: 17. ...
... The world bank has estimated that almost 48 billion m3 of water is wasted annually by the WDNs, costing water companies 14 billion US dollars [14]- [15]. As per the WDNs study, WDNs leak varies from 5% of European countries to 50% of developed countries, which experience a high leak level of 30% [16]. These alarming figures highlight the pressing need to take action and improve the efficiency of water distribution systems [17]. ...
Conference Paper
Water Distribution Networks (WDNs) suffer from significant water losses due to pipeline leaks, leading to financial losses and exacerbating global water scarcity concerns. This paper aims to improve the accuracy of leak detection and the identification of leakage locations in WDNs. Leakage in WDNs is a global challenge that impacts water utilities economically and environmentally. Traditional methods for leak detection are time-consuming, resource-intensive, and prone to inaccurate results and false alarms due to randomly placed sensors. Detecting concealed background leaks in WDNs is particularly challenging. To address these limitations, this paper proposes an enhanced ensemble supervised Machine Learning (ML) algorithm that integrates Graph Theory (GT), Support Vector Machines (SVM), and Artificial Neural Networks (ANN). By combining multiple ML algorithms, the proposed model considers various factors affecting leak detection and localization, resulting in more accurate and reliable assessments of leak presence and location. The combination of EPANET and MATLAB provides a powerful tool for evaluating the performance of the proposed SVM-ANN-GT algorithm and comparing it with traditional methods. The results of the proposed SVM-ANN-GT algorithm can be used as a benchmark for accuracy and effectiveness. Three algorithms, SVM-ANN-GT, SVM, and ANN, were evaluated using EPANET. The SVM-ANN-GT algorithm achieved the highest average leak detection accuracy of 96%, followed by SVM at 85% and ANN at 80%. The superior performance of SVM-ANN-GT can be attributed to its strategic sensor placement based on graph theory principles, which the other algorithms lacked, leading to their lower accuracy.
... requests are still observed in the global DNS infrastructure. Mohaisen and Ren (2017) explored this phenomenon. By leveraging two large DNS traces, the authors uncovered high volumes of diverse leakage in the DNS infrastructure that was geographically distributed and targeted several types of services. ...
Article
As the Internet has transformed into a critical infrastructure, society has become more vulnerable to its security flaws. Despite substantial efforts to address many of these vulnerabilities by industry, government, and academia, cyber security attacks continue to increase in intensity, diversity, and impact. Thus, it becomes intuitive to investigate the current cyber security threats, assess the extent to which corresponding defenses have been deployed, and evaluate the effectiveness of risk mitigation efforts. Addressing these issues in a sound manner requires large-scale empirical data to be collected and analyzed via numerous Internet measurement techniques. Although such measurements can generate comprehensive and reliable insights, doing so encompasses complex procedures involving the development of novel methodologies to ensure accuracy and completeness. Therefore, a systematic examination of recently developed Internet measurement approaches for cyber security must be conducted to enable thorough studies that employ several vantage points, correlate multiple data sources, and potentially leverage past successful techniques for more recent issues. Unfortunately, performing such an examination is challenging, as the literature is highly scattered. In large part, this is due to each research effort only focusing on a small portion of the many constituent parts of the Internet measurement domain. Moreover, to the best of our knowledge, no studies have offered an in-depth examination of this critical research domain in order to promote future advancements. To bridge these gaps, we explore all pertinent facets of utilizing Internet measurement techniques for cyber security, ranging from threats within specific application domains to threats themselves. We provide a taxonomy of cyber security-related Internet measurement studies across two dimensions. 
One dimension relates to the many vertical layers (and components) of the Internet ecosystem, while the other relates to internal normal functions vs. the negative impact of external parties in the Internet and physical world. A comprehensive comparison of the gathered studies is also offered in terms of measurement technique, scope, measurement size, vantage size, and the analysis approach that was leveraged. Finally, a discussion of the roadblocks to performing effective Internet measurements and possible future research directions is elaborated.
... Toward understanding Tor security and privacy issues, Mohaisen et al. studied the possibility of observing Tor requests at global DNS infrastructure that could threaten the private location of servers hosting Tor services, and name/onion address of Tor domains [18]. McCoy et al. tried to answer how Tor is (mis-)used and what clients and routers contribute to this usage [11]. ...
Article
Tor is the most well-known anonymity network that protects the identity of both content providers and their clients against any tracking on the Internet. The previous research on Tor investigated either the security and privacy concerns or the information and hyperlink structure. However, there is still a lack of knowledge about the information leakage attributed to the links from Tor hidden services to the surface Web. This work addresses this gap by a broad evaluation on: (a) the network of links from Tor to the surface Web, (b) the vulnerability of Tor hidden services against the information leakage, (c) the changes in the overall hyperlink structure of Tor hidden services caused by linking to surface websites, and (d) the type of information and services provided by the domains with significant impact on Tor’s network. The results recover the dark-to-surface network as a single massive, connected component where over 90% of identified Tor hidden services have at least one link to the surface world. We also identify that Tor directories significantly contribute to both communication and information dissemination through the network. Our study is the product of crawling approximately 2 million pages from 23,145 onion seed addresses, over a three-month period.
... A range of works have been conducted to address privacy-related issues of DNS [122,77,121,78,123,79,120,124,125,126]. For example, Herrmann et al. [77] have explored the emerging threat of third-party DNS resolvers, e.g., Google Public DNS and OpenDNS, to online privacy and introduced a lightweight privacy-preserving name resolution service called EncDNS. ...
... names at stub, recursive, and authoritative resolvers to improve its privacy. Mohaisen and Ren [125] have investigated the leakage of .onion at the A and J DNS root nodes over a longitudinal period of time and have found that .onion leakage is common and persistent at DNS infrastructure. ...
Preprint
The domain name system (DNS) is one of the most important components of today's Internet, and is the standard naming convention between human-readable domain names and machine-routable IP addresses of Internet resources. However, due to the vulnerability of DNS to various threats, its security and functionality have been continuously challenged over the course of time. Although researchers have addressed various aspects of the DNS in the literature, there are still many challenges yet to be addressed. In order to comprehensively understand the root causes of the vulnerabilities of DNS, it is mandatory to review the various activities in the research community on the DNS landscape. To this end, this paper surveys more than 170 peer-reviewed papers, which are published in both top conferences and journals in the last ten years, and summarizes vulnerabilities in DNS and corresponding countermeasures. This paper not only focuses on the DNS threat landscape and existing challenges, but also discusses the utilized data analysis methods, which are frequently used to address DNS threat vulnerabilities. Furthermore, we looked into the DNS threat landscape from the viewpoint of the involved entities in the DNS infrastructure in an attempt to point out more vulnerable entities in the system.
... into a web browser, the domain name will be mapped, by a set of DNS servers, to the associated IP address, e.g., 1.2.3.4. Almost all Internet services depend on DNS to connect users to hosts by resolving DNS queries in various ways [5]. However, because DNS is an open system, anyone may query publicly-accessible resolvers, called open resolvers. ...
Conference Paper
Open DNS resolvers are resolvers that perform recursive resolution on behalf of any user. They can be exploited by adversaries because they are open to the public and require no authorization to use. Therefore, it is important to understand the state of open resolvers to gauge their potentially negative impact on the security and stability of the Internet. In this study, we conducted a comprehensive probing over the entire IPv4 address space and found that more than 3 million open resolvers still exist in the wild. More importantly, we found that many open resolvers answer queries with incorrect, even malicious, responses. In contrast to results obtained in 2013, we found that the number of open resolvers has decreased significantly, while the number of open resolvers providing malicious responses has increased.
... Towards understanding Tor security and privacy issues, Mohaisen et al. studied the possibility of observing Tor requests at global DNS infrastructure that could threaten the private location of servers hosting Tor services, and name/onion address of Tor domains [21]. McCoy et al. tried to answer how Tor is (mis-)used and what clients and routers contribute to this usage [20]. ...
Preprint
Tor is one of the most well-known networks that protects the identity of both content providers and their clients against any tracking or tracing on the Internet. So far, most research attention has been focused on investigating the security and privacy concerns of Tor and characterizing the topic or hyperlink structure of its hidden services. However, there is still a lack of knowledge about the information leakage attributed to the linking from Tor hidden services to the surface Web. This work addresses this gap by presenting a broad evaluation of the network of referencing from Tor to the surface Web and investigates to what extent Tor hidden services are vulnerable to this type of information leakage. The analyses also consider how linking to surface websites can change the overall hyperlink structure of Tor hidden services. They also provide reports regarding the type of information and services provided by Tor domains. Results recover the dark-to-surface network as a single massive connected component where over 90% of Tor hidden services have at least one link to the surface world despite their interest in being isolated from surface Web tracking. We identify that Tor directories have the closest proximity to all other Web resources and significantly contribute to both communication and information dissemination through the network, which emphasizes the main application of Tor as an information provider to the public. Our study is the product of crawling nearly 2 million pages from 23,145 onion seed addresses, over a three-month period.
... This monitoring is often done to detect security risks and information leakage on Tor that can compromise the anonymity of its users and the paths packets take. Mohaisen et al. study the possibility of observing Tor requests at global DNS infrastructure that could threaten the private location of servers hosting Tor sites, such as the name and onion address of Tor domains [20]. McCoy et al. study the clients using and routers that are a part of Tor by collecting data from exit routers [18]. ...
Conference Paper
Tor is among the most well-known darknets in the world. It has noble uses, including as a platform for free speech and information dissemination under the guise of true anonymity, but may be culturally better known as a conduit for criminal activity and as a platform to market illicit goods and data. Past studies on the content of Tor support this notion, but were carried out by targeting popular domains likely to contain illicit content. A survey of past studies may thus not yield a complete evaluation of the content and use of Tor. This work addresses this gap by presenting a broad evaluation of the content of the English Tor ecosystem. We perform a comprehensive crawl of the Tor dark web and, through topic and network analysis, characterize the 'types' of information and services hosted across a broad swath of Tor domains and their hyperlink relational structure. We recover nine domain types defined by the information or service they host and, among other findings, unveil how some types of domains intentionally silo themselves from the rest of Tor. We also present measurements that (regrettably) suggest how marketplaces of illegal drugs and services do emerge as the dominant type of Tor domain. Our study is the product of crawling over 1 million pages from 20,000 Tor seed addresses, yielding a collection of over 150,000 Tor pages. The domain structure is publicly available as a dataset at https://github.com/wsu-wacs/TorEnglishContent.
... Tor traffic monitoring is another related area of work. This monitoring is often done to detect security risks and information leakage on Tor that can compromise the anonymity of its users and the paths packets take. Mohaisen et al. study the possibility of observing Tor requests at global DNS infrastructure that could threaten the private location of servers hosting Tor sites, such as the name and onion address of Tor domains [28]. McCoy et al. study the clients using and routers that are a part of Tor by collecting data from exit routers [26]. ...
Preprint
Tor is among the most well-known darknets in the world. It has noble uses, including as a platform for free speech and information dissemination under the guise of true anonymity, but may be culturally better known as a conduit for criminal activity and as a platform to market illicit goods and data. Past studies on the content of Tor support this notion, but were carried out by targeting popular domains likely to contain illicit content. A survey of past studies may thus not yield a complete evaluation of the content and use of Tor. This work addresses this gap by presenting a broad evaluation of the content of the English Tor ecosystem. We perform a comprehensive crawl of the Tor dark web and, through topic and network analysis, characterize the types of information and services hosted across a broad swath of Tor domains and their hyperlink relational structure. We recover nine domain types defined by the information or service they host and, among other findings, unveil how some types of domains intentionally silo themselves from the rest of Tor. We also present measurements that (regrettably) suggest how marketplaces of illegal drugs and services do emerge as the dominant type of Tor domain. Our study is the product of crawling over 1 million pages from 20,000 Tor seed addresses, yielding a collection of over 150,000 Tor pages. We intend to make the domain structure publicly available as a dataset at https://github.com/wsu-wacs/TorEnglishContent.