Probabilistic data structures to estimate cardinalities and frequencies of massive streams

Posted on July 20, 2016 in BigData • Tagged with LogLog-Count, Count-Min-Sketch, Linear Count, Big Data, Stream Processing • 5 min read

In the following blog post we will introduce three different Big Data algorithms. More specifically, we will learn about probabilistic data structures that allow us to estimate cardinalities and frequencies of elements that originate from a massive stream of data. This blog post is heavily inspired by a the well written article on probabilistic data structures for web analytics and data mining. I will not cover the mathematics behind those data structures, the beforementioned blog post does that much better. And if not, then you should probably consult the original papers.

What is Big Data anyways?

Everybody talks nowadays about Big Data, but what does it mean? For example, if we want to count the number of distinct IP Addresses that a very large web site encounters on each day, we need new approaches. Consider the following straightforward algorithm:

unique_ip_addresses = set()
for ip in stream_of_ip_addresses:
    unique_ip_addresses.add(ip)
    if end_of_day(time):
        print('We got {} distinct ip addresses'.format(len(unique_ip_addresses)))
        unique_ip_addresses = set()

This way of counting distinct elements works fine for millions of visitors. But what happens if a website is visited 10 Billion times a day …


Continue reading

What other package managers are vulnerable to typo squatting attacks?

Posted on June 30, 2016 in Security • Tagged with security, Typosquatting, nuget, cargo • 6 min read

In my last blog post about typosquatting package managers I discussed my findings about attacking the programming language package managers from rubygems.org, PyPi and npmjs.com.

This blog contribution generated quite some interest and people subsequently asked themselves whether other package managers might also be vulnerable to this hybrid attack (typosquatting involves a technical and psychological attack vector). During the time I wrote my thesis, I encountered some other package managers. A very good overview of some of the most recent package managers gives the github showcase page about package managers which is summarized in the table below:

Package Manager Name # of Stars on Github
bower/bower 14257
VundleVim/Vundle.vim 11969
npm/npm 9664
alcatraz/Alcatraz 8936
CocoaPods/CocoaPods 8115
composer/composer 7909
Carthage/Carthage 7160
jordansissel/fpm 6722
componentjs/component 4503
apple/swift-package-manager 4318
wbond/package_control 3018
pypa/pip 2911
chocolatey/chocolatey 2741
Masterminds/glide 2163
tmux-plugins/tpm 1961
Homebrew/brew 1757
rust-lang/cargo 1705
rubygems/rubygems 1547
caolan/jam 1540
volojs/volo 1326
gpmgo/gopm 1027
spmjs/spm 882
atom/apm 690
freshshell/fresh 674
ruslo/hunter 436
ocaml/opam 425
NuGet/Home 367

The obvious question now is: How many of those package managers …


Continue reading

Typosquatting programming language package managers

Posted on June 08, 2016 in Security • Tagged with PyPi, Npmjs.com, rubygems.org, security, Typosquatting • 10 min read

Edit: It seems that the blog post and the thesis caused quite some interest. Please contact me under the following mail address: admin [|[at]|] incolumitas [[|dot|]] com

In this blog post, it is demonstrated how

  • 17000 computers were forced to execute arbitrary code by typosquatting programming language packages/libraries
  • 50% of these installations were conducted with administrative rights
  • Even highly security aware institutions (.gov and .mil hosts) fell victim to this attack
  • a typosquatting attack becomes wormable by mining the command history data of hosts
  • some good defenses against typosquatting package managers might look like

The complete thesis can be downloaded as a PDF.

In the second part of 2015 and the early months of 2016, I worked on my bachelors thesis. In this thesis, I tried to attack programming language package managers such as Pythons PyPi, NodeJS Npmsjs.com and Rubys rubygems.org. The attack does not exploit a new technical vulnerability, it rather tries to trick people into installing packages that they not intended to run on their systems.

DNS Typosquatting

In the domain name system, typosquatting is a well known problem. Typosquatting is the malicious registering of a domain that is lexically similar to another, often highly …


Continue reading

Nebula Wargame walkthrough Level 10-19

Posted on September 29, 2015 in Wargames • Tagged with Linux, Programming, Security, Problem Solving • 21 min read

Walkthrough of nebula wargame from level 10 to level 19


Continue reading

Nebula Wargame walkthrough Level 0-9

Posted on September 28, 2015 in Wargames • Tagged with Linux, Programming, Security, Problem Solving • 6 min read

In this blog post we will walk through the solutions of the levels 0 to 9 of the Nebula wargame, which is hosted on http://exploit-exercises.com. This writeup will force me to memorize commands better and exercise a bit. I fear that this writeup is of no use for other people, since you hopefully want to solve those exercises on your own :)

Level 0 - Finding setuid programs in the filesystem

As the descriptions states you need to find a setuid binary that gets a shell for the flag00 user. We can find setuid executables with a command such as the following:

find / -type f -perm -4000 -user flag00 2>/dev/null

This command suppresses error messages (The 2>/dev/null part redirects error output to /dev/null). Furthermore the -perm -4000 flag is responsible for

All  of  the  permission bits mode are set for the file.  Symbolic modes are accepted in this form, and this is usually the way in which would want to use
them.  You must specify `u', `g' or `o' if you use a symbolic mode.   See the EXAMPLES section for some illustrative examples.

Now execute the found binary and run getflag and you should be …


Continue reading

Solution for wargame natas19

Posted on September 15, 2015 in Php • Tagged with Python, Wargames, Php, Security • 3 min read

Hi everyone

I am still trying to solve wargames on overthewire. Level 19 proofed to be very similar to level 18, where server side code looks something like the following:

<?

$maxid = 640; // 640 should be enough for everyone

function isValidAdminLogin() { /* {{{ */
    if($_REQUEST["username"] == "admin") {
    /* This method of authentication appears to be unsafe and has been disabled for now. */
        //return 1;
    }

    return 0;
}
/* }}} */
function isValidID($id) { /* {{{ */
    return is_numeric($id);
}
/* }}} */
function createID($user) { /* {{{ */
    global $maxid;
    return rand(1, $maxid);
}
/* }}} */
function debug($msg) { /* {{{ */
    if(array_key_exists("debug", $_GET)) {
        print "DEBUG: $msg";
    }
}
/* }}} */
function my_session_start() { /* {{{ */
    if(array_key_exists("PHPSESSID", $_COOKIE) and isValidID($_COOKIE["PHPSESSID"])) {
        if(!session_start()) {
            debug("Session start failed");
            return false;
        } else {
            debug("Session start ok");
            if(!array_key_exists("admin", $_SESSION)) {
                debug("Session was old: admin flag set");
                $_SESSION["admin"] = 0; // backwards compatible, secure
            }
            return true;
        }
    }

    return false;
}
/* }}} */
function print_credentials() { /* {{{ */
    if($_SESSION and array_key_exists("admin", $_SESSION) and $_SESSION["admin"] == 1) {
        print "You are an admin. The credentials for the next level are:";
        print "Username: natas19n";
        print "Password: ";  
    } else {  
        print "You are logged in as a regular user. Login as an admin to retrieve credentials for natas19.";  
    }  
}  
/* }}} */

$showform = true;  
if(my_session_start()) {  
    print_credentials …

Continue reading