Kademlia Documentation¶
Note
This library assumes you have a working familiarity with asyncio.
This library is an asynchronous Python implementation of the Kademlia distributed hash table. It uses asyncio to provide asynchronous communication. The nodes communicate using RPC over UDP to communiate, meaning that it is capable of working behind a NAT.
This library aims to be as close to a reference implementation of the Kademlia paper as possible.
Usage¶
To start a new network, create the first node. Future nodes will connect to this first node (and any other nodes you know about) to create the network.
import logging
import asyncio
from kademlia.network import Server
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
log = logging.getLogger('kademlia')
log.addHandler(handler)
log.setLevel(logging.DEBUG)
loop = asyncio.get_event_loop()
loop.set_debug(True)
server = Server()
loop.run_until_complete(server.listen(8468))
try:
loop.run_forever()
except KeyboardInterrupt:
pass
finally:
server.stop()
loop.close()
Here’s an example of bootstrapping a new node against a known node and then setting a value:
import logging
import asyncio
import sys
from kademlia.network import Server
if len(sys.argv) != 5:
print("Usage: python set.py <bootstrap node> <bootstrap port> <key> <value>")
sys.exit(1)
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
log = logging.getLogger('kademlia')
log.addHandler(handler)
log.setLevel(logging.DEBUG)
async def run():
server = Server()
await server.listen(8469)
bootstrap_node = (sys.argv[1], int(sys.argv[2]))
await server.bootstrap([bootstrap_node])
await server.set(sys.argv[3], sys.argv[4])
server.stop()
asyncio.run(run())
Note
You must have at least two nodes running to store values. If a node tries to store a value and there are no other nodes to provide redundancy, then it is an exception state.
Fidelity to Original Paper¶
The current implementation should be an accurate implementation of all aspects of the paper except one - in Section 2.3 there is the requirement that the original publisher of a key/value republish it every 24 hours. This library does not do this (though you can easily do this manually).
Querying the DHT¶
If you just want to query the network, you can use the example query script. For instance:
$ python examples/get.py 1.2.3.4 8468 SpecialKey
The query script is simple:
import logging
import asyncio
import sys
from kademlia.network import Server
if len(sys.argv) != 4:
print("Usage: python get.py <bootstrap node> <bootstrap port> <key>")
sys.exit(1)
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
log = logging.getLogger('kademlia')
log.addHandler(handler)
log.setLevel(logging.DEBUG)
async def run():
server = Server()
await server.listen(8469)
bootstrap_node = (sys.argv[1], int(sys.argv[2]))
await server.bootstrap([bootstrap_node])
result = await server.get(sys.argv[3])
print("Get result:", result)
server.stop()
asyncio.run(run())
Check out the examples folder for other examples.
Kademlia API¶
Kademlia is a Python implementation of the Kademlia protocol which utilizes the asyncio library.
The best place to start is the examples folder before diving into the API.
kademlia package¶
The best place to start is the examples folder before diving into the API.
kademlia.crawling module¶
-
class
kademlia.crawling.
NodeSpiderCrawl
(protocol, node, peers, ksize, alpha)[source]¶ Bases:
kademlia.crawling.SpiderCrawl
Create a new C{SpiderCrawl}er.
- Parameters
-
class
kademlia.crawling.
RPCFindResponse
(response)[source]¶ Bases:
object
A wrapper for the result of a RPC find.
- Parameters
response – This will be a tuple of (<response received>, <value>) where <value> will be a list of tuples if not found or a dictionary of {‘value’: v} where v is the value desired
-
class
kademlia.crawling.
SpiderCrawl
(protocol, node, peers, ksize, alpha)[source]¶ Bases:
object
Crawl the network and look for given 160-bit keys.
Create a new C{SpiderCrawl}er.
-
class
kademlia.crawling.
ValueSpiderCrawl
(protocol, node, peers, ksize, alpha)[source]¶ Bases:
kademlia.crawling.SpiderCrawl
Create a new C{SpiderCrawl}er.
- Parameters
kademlia.network module¶
kademlia.node module¶
-
class
kademlia.node.
Node
(node_id, ip=None, port=None)[source]¶ Bases:
object
Simple object to encapsulate the concept of a Node (minimally an ID, but also possibly an IP and port if this represents a node on the network). This class should generally not be instantiated directly, as it is a low level construct mostly used by the router.
Create a Node instance.
- Parameters
node_id (int) – A value between 0 and 2^160
ip (string) – Optional IP address where this Node lives
port (int) – Optional port for this Node (set when IP is set)
-
class
kademlia.node.
NodeHeap
(node, maxsize)[source]¶ Bases:
object
A heap of nodes ordered by distance to a given node.
Constructor.
@param node: The node to measure all distnaces from. @param maxsize: The maximum size that this heap can grow to.
-
remove
(peers)[source]¶ Remove a list of peer ids from this heap. Note that while this heap retains a constant visible size (based on the iterator), it’s actual size may be quite a bit larger than what’s exposed. Therefore, removal of nodes may not change the visible size as previously added nodes suddenly become visible.
-
kademlia.protocol module¶
kademlia.routing module¶
-
class
kademlia.routing.
KBucket
(rangeLower, rangeUpper, ksize, replacementNodeFactor=5)[source]¶ Bases:
object
kademlia.storage module¶
-
class
kademlia.storage.
ForgetfulStorage
(ttl=604800)[source]¶ Bases:
kademlia.storage.IStorage
By default, max age is a week.
kademlia.utils module¶
General catchall for functions that don’t make sense as methods.
Find the shared prefix between the strings.
For instance:
sharedPrefix([‘blahblah’, ‘blahwhat’])
returns ‘blah’.