More about rotations in 4 dimensions.

It is easy to take a quaternion and recognize the rotation it actually represents. This is about doing the same thing for 4 dimensional rotations. It is easy to multiply two four dimensional rotations together. Can we extract the orthogonal planes of rotation and their angles of rotations from that. We can represent a rotation as a product of two vectors $ab$ via $w\ mapsto ba v ab$ and this will be the rotation of the $ab$ plane which leaves the space orthogonal to that plane pointwise fixed. It will be the rotations from the vector $a$ to the vector $b$ applied twice. This specifies both the direction and that amount of the rotation. Since $ab = g(a,b) + a\wedge b$ then this rotation is $\cos(\alpha/2) + \sin(\alpha/2) \omega$ where $\alpha$ is the amount of rotation and $\omega$ is $a \wedge b$ scaled to have norm one by a positive scalar constant. The product $a\wedge b$ specifies the direction in which the rotation $\alpha$ is applied, since any vectors $c\wedge d$ which equals $a \wedge b$ specifies the direction of the rotation as it is a rotation of angle $\alpha$ rotating starting from $c$ and moving toward $d$. We can extract the plane $a,b$ from the product $a\wedge b$ because it is those vectors $u$ with $u\wedge (a\wedge b) = 0$. This is a system of 4 homogeneous equations which has rank 2 over $\mathbb{R}^4$. We can extract the orthogonal plane $\phi = c \wedge d$ to the $a,b$ plane from $\omega$ alone. Write $\omega = r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4$ where $r_{12}r_{34} - r_{13}r_{24} + r_{14}r_{23} = 0$. Then the orthogonal plane is $\psi = r_{34}e_1e_2 - r_{24} e_1e_3 + r_{23} e_1e_4 + r_{14}e_2e_3 - r_{13}e_2e_4 + r_{12}e_3e_4$. Then $\omega \psi = (r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4) (r_{34}e_1e_2 - r_{24} e_1e_3 + r_{23} e_1e_4 + r_{14}e_2e_3 - r_{13}e_2e_4 + r_{12}e_3e_4)$. The constant term is $-r_{12}r_{34} + r_{13}r_{24} - r_{14}r_{23} - r_{23}r_{14} + r_{24}r_{13} - r_{34}r_{12} = 0$. The $e_1e_2e_3e_4$ term is $r_{12}^2 + r_{13}^2 + r_{14}^2 + r_{23}^2 + r_{24}^2 + r_{34}^2$. The $e_1e_2$ term of $\omega\psi$ is $r_{13}r_{14} e_1e_3e_2e_3 - r_{14}r_{13}e_1e_4e_2e_4 - r_{23}r_{24}e_2e_3 e_1e_3 + r_{24}r_{23} e_2e_4 e_1e_4$ This simplifies to zero. The other grade 2 terms should simply to zero as well. Then $(a + b \omega) ( c + d \psi) = ac + (bc \omega + ad \psi) + bd \omega\psi$. So if $\omega$ and $\psi$ were norm one then $\omega\psi = e_1e_2e_3e_4$. Thus our product becomes $ac + (bc \omega + ad\psi) + bd e_1e_2e_3e_4$. If $a = \cos(\alpha/2)$ and $b=\sin(\alpha/2)$ and if $c=\cos(\beta/2)$ and $d = \sin(\beta/2)$ then $ac = \cos(\alpha/2)\cos(\beta/2)$ and $bd = \sin(\alpha/2)\sin(\beta/2)$ and consequently $ac - bd = \cos((\beta - \alpha)/2)$ and $ac + bd = \cos((\alpha + \beta)/2)$. Since $ac$ and $bd$ are known (being the constant and the volume coefficients) then we can solve this for $\alpha$ and $\beta$. Now $ad + bc = \sin((\alpha + \beta)/2)$ And $ad - bc = \sin((\beta - \alpha)/2)$. Now $ bc\omega + ad \psi = \frac{1}{2}(ad + bc) (\psi + \omega) + \frac{1}{2}(ad - bc) (\psi - \omega)$. This then equals $\frac{1}{2}\sin((\alpha + \beta)/2) (\psi + \omega) + \frac{1}{2}\sin((\beta-\alpha)/2)(\psi - \omega)$ This is useful because $ac$ and $bd$ are not known. But we can find $\sin((\alpha + \beta)/2)$ and $\sin((\beta-\alpha)/2)$ up to sign (of each) using $ac$ and $bd$. This then becomes easy to solve for the actual coefficients $r_{ij}$ of $\omega$ (and $\psi$).

Some notes on geometric algebra in 4 dimensions

Working in $\mathbb{R}^4$ here with the geometric algebra corresponding to the usual norm, i.e. $e_i e_j = - e_j e_i$ for $i \neq j$ and $e_i e_i = 1$. The inverse of a vector $v$ is just $v / \|v\|^2$. A geometric algebra is an algebra with zero divisors, for if $v$ is a vector of norm 1 so $v v = 1$ then $\frac{1}{2}(1 + v)$ is idempotent and is also a zero divisor (by multiplying by $(1 - v)$. An element of grade $k$ is often called a $k$ vector. Now the product of two vectors $a,b$ is $g(a, b) + a\wedge b$. The product of $a$ and $b - a g(a,b)/g(a,a)$ is therefore $g(a,b) - g(a,a)g(a,b)/g(a,a) + (a \wedge (b - ag(a,b)/g(a,a)) ) = a \wedge b$ since $a \wedge a = 0$. So the grade 2 part of a 2 blade is also always a 2 blade. Put differently, $a\wedge b$ is always a 2 blade. Now given a 2 blade $a\wedge b$ then as above we can assume without loss of generality that $g(a,b) = 0$ and we can also assume that $a$ and $b$ are unit vectors. Then for any scalar $\alpha$ the two blade $a (b + \alpha a) = g(a,b+\alpha a) + a\wedge (b + \alpha a) = \alpha + a \wedge b$. So the grade 0 part is irrelevant to whether something is a 2 blade. If $A = u v$ for othogonormal vectors $u,v$ then $AA = uvuv = -uuvv = -1$. Let $A = r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4$. Then $AA = -1$ means $(r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4)(r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4) = -1$ Now $r_{ij} e_ie_j r_{ij} e_ie_j = r^2_{ij} e_ie_je_ie_j = -r^2_{ij} e_ie_ie_je_j = -r^2_{ij}$. For the mixed terms things are uglier. The $e_1e_2e_3e_4$ term is $r_{12}r_{34}e_1e_2e_3e_4 + r_{13}r_{24}e_1e_3e_2e_4 + r_{14}r_{23}e_1e_4 e_2e_3 + r_{23}r_{14}e_2e_3e_1e_4 + r_{24}r_{13}e_2e_4e_1e_3 + r_{34}r_{12}e_3e_4e_1e_2$ This is equal to $(r_{12}r_{34} - r_{13}r_{24} + r_{14}r_{23} + r_{23}r_{14} - r_{24}r_{13} + r_{34}r_{12})$. Thus $r_{12}r_{34} - r_{13}r_{24} + r_{14}r_{23} = 0$ for a blade. The $e_1e_2$ part of $AA$ is $r_{13}r_{23}e_1e_3e_2e_3 + r_{14}r_{24}e_1e_4e_2e_4 + r_{23}r_{13} e_2e_3e_1e_3 + r_{24}r_{14}e_2e_4e_1e_4 = (-r_{13}r_{32} - r_{14}r_{24} + r_{13}r_{23} + r_{24}r_{14})e_1e_2 = 0$. Let $B = r_{12} e_1e_2 + r_{13} e_1e_3 + r_{14} e_1e_4 + r_{23}e_2e_3 + r_{24}e_2e_4 + r_{34} e_3 e_4$ in $\mathbb{R}^4$ and assume that $r_{12}r_{34} - r_{13}r_{24} + r_{14}r_{23} = 0$. How big is the space of vectors $u$ with $u \wedge B = 0$? If $u = x_1 e_1 + x_2 e_2 + x_3 e_3 + x_4 e_4$ then $u\wedge B = 0$ becomes $x_1 r_{23} e_1e_2e_3 + x_1 r_{24} e_1e_2e_4 + x_1 r_{34} e_1e_3e_4 + x_2 r_{13} e_2 e_1e_3 + x_2r_{14} e_2 e_1 e_4 + x_2 r_{34}e_2e_3e_4 + x_3 r_{12} e_3 e_1e_2 + x_3 r_{14} e_3e_1e_4 + x_3 r_{24} e_3 e_2e_4 + x_4 r_{12} e_4 e_1 e_2 + x_4 r_{13} e_4 e_1 e_3 + x_4 r_{23} e_4 e_2e_3 = 0$ This then becomes the system of 4 homogeneous equations: $x_1 r_{23} - x_2 r_{13} + x_3r_{12} = 0$, $x_1 r_{24} - x_2 r_{14} + x_4r_{12} = 0$, $x_1 r_{34} - x_3 r_{14} + x_4r_{13} = 0$, $x_2 r_{34} - x_3 r_{24} + x_4r_{23} = 0$. Taking $r_{14}$ times the first equation minus $r_{13}$ times the second we obtain $x_1(r_{14}r_{23}-r_{13}r_{24}) + x_3r_{14}r_{12} - x_4r_{13}r_{12} = 0$ If $r_{12}r_{34} - r_{13}r_{24} + r_{14}r_{23} = 0$ then we can rewrite this as $-x_1 r_{12}r_{34} + x_3 r_{14}r_{12} - x_4 r_{13}r_{12} = 0$ which is $-r_{12}$ times the third equation. Similarly if we take the linear combination of the first two equations which eliminates the $x_1$ term we should obtain the fourth equation. Thus the system has rank at most two. If $B$ is a wedge of two orthonormal vectors then we can obtain the span of those vectors as the space of vectors $u$ satisfying $u\wedge B = 0$.

Rotations in 4D

So the geometric algebra specifies a rotation on $\mathbb{R}^n$ in quite a nice form. Given unit vectors $a,b$ then defining $\sigma\colon v \mapsto b a v a b$ we get the $\sigma$ is a rotation that leaves everything perpendicular to the $a,b$ plan pointwise fixed, and within the $a,b$ plane $\sigma$ is the rotation from $a$ to $b$ applied twice (so that the rotation of $\sigma$ is twice the angle between $a$ and $b$). Now in the plane if we choose $a$ to be $e_1 = (1,0)$ and we choose $b = \cos(\theta/2) e_1 + \sin(\theta/2) e_2 = (\cos(\theta/2),\sin(\theta/2))$ so that applying the rotation from $a$ to $b$ twice is a counterclockwise rotation of the plane by angle $\theta$ then the geometric product $ab = \cos(\theta/2) + \sin(\theta/2) e_1 e_2$ and the geometric $ba = \cos(\theta/2) - \sin(\theta/2) e_1 e_2$. Now in 4 dimensions a rotation is made up of a plane, and its orthogonal plane, each of them being setwise invariant, and each undergoing a rotation. Thus we can see a model of a 4 dimensional rotation by taking the product $(\cos(\alpha/2) + \sin(\alpha/2) e_1 e_2) (\cos(\beta/2) + \sin(\beta/2) e_3 e_4)$. Then $v \mapsto q^{-1} v q$ is a rotation in $\mathbb{R}^4$ which is a rotation in the direction from $e_1$ to $e_2$ by angle $\alpha$ in the $e_1,e_2$ plane, and is a rotation in the $e_3,e_4$ plane rotating in the direction from $e_3$ to $e_4$ by angle $\beta$. More interestingly, given a general unit member $z = a + b_{12} e_1 e_2 + b_{13} e_1 e_3 + b_{14} e_1 e_4 + b_{23} e_2 e_3 + b_{24} e_2 e_4 + b_{34} e_3 e_4 + c e_1 e_2 e_3 e_4$ one wonders whether we can factor it as a product of two planar orthogonal rotations. That is, can you read off the planes of rotation and the corresponding angles of rotation from any unit even member of the geometric algebra. The answer is that yes one can. Assume that the two rotations are $g = \cos(\alpha/2) + \sin(\alpha/2) (r_{12} e_1 e_2 + r_{13} e_1 e_3 + r_{14} e_1 e_4 + r_{23} e_2 e_3 + r_{24} e_2 e_4 + r_{34} e_3 e_4)$ and by $h = \cos(\beta/2) + \sin(\beta/2) (s_{12} e_1 e_2 + s_{13} e_1 e_3 + s_{14} e_1 e_4 + s_{23} e_2 e_3 + s_{24} e_2 e_4 + s_{34} e_3 e_4)$. For a rotation to represent a plane it must have $r_{12} r_{34} + r_{13} r_{24} + r_{14} r_{23} = 0$ For the planes of these two members of the product to be orthogonal we must have $r_{12} = s_{34}$, $r_{13} = s_{24}$, $r_{14} = s_{23}$, $r_{23} = s_{14}$, $r_{24} = s_{13}$, $r_{34} = s_{12}$. Consequently the grade 4 part of the product $g h$ is $r_{12} s_{34} + r_{13} s_{24} + r_{14} s_{23} + r_{23} s_{14} + r_{24} s_{13} + r_{34} s_{14}$ which is equal to $r_{12}^2 + r_{13}^2 + r_{14}^2 + r{23}^2 + r_{24}^2 + r_{34}^2 = 1$. Consequently if $z = gh$ then $a = \cos(\alpha/2) \cos(\beta/2)$ and $c= \sin(\alpha/2) \sin(\beta/2)$. We then have $a - c = \cos((\alpha + \beta)/2)$ and $a + c = \cos((\alpha - \beta)/2)$. From this we can readily extract both $\alpha$ and $\beta$. Then since $b_{ij} = \cos(\beta/2)\sin(\alpha/2) r_{ij} + \sin(\beta/2) \cos(\alpha/2) s_{ij} = \cos(\beta/2)\sin(\alpha/2) r_{ij} + \sin(\beta/2) \cos(\alpha/2) r_{kl}$ and $b_{kl} = \cos(\beta/2)\sin(\alpha/2) r_{kl} + \sin(\beta/2) \cos(\alpha/2) s_{kl} = \cos(\beta/2)\sin(\alpha/2) r_{kl} + \sin(\beta/2) \cos(\alpha/2) r_{ij}$ where $kl$ are the two members different than $ij$ we have equations we can readily solve for $r_{ij}$ and $r_{kl}$ in terms of $b_{ij}$ and $b_{kl}$ (since $\beta$ and $\alpha$ are known). Thus one can decompose any even unit member of the geometric algebra of $\mathbb{R}^4$ into two orthogonal plane rotations with known angles and directions of rotation.
So the wedge product can be defined by simply setting a basis $e_i$ and requiring that $e_i \wedge e_j = - e_j \wedge e_i$ for $i \neq j$ and that $e_i \wedge e_i = 0$ along with commuting with scalars and distribution and associativity. It can be shown then that for any vector $a$ one has $a \wedge a = 0$. The wedge product defines a bit of signed area (or volume) on an oriented plane, hyperplane, etc.... just as a vetor represents a signed quantity in a specific direction. The (typical) geometric product is defined as follows for an orthonormal basis: $e_i e_j = - e_i e_j$ for $i\neq j$ and $e_i e_i = 1$. If $a$ is a vector then $a a = a\cdot a$ the usual dot product of a with itself. In general for vectors $a$ and $b$ the geometric product is $ab = a\cdot b + a \wedge b$. If $b$ has unit length then $b b = 1$, so $b$ is its own inverse. Consider the conjugation $\tau_v \colon v \mapsto e_1 v e_1$. Then for any basis vector other than $e_1$ the result is $e_i \mapsto e_1 e_i e_1 = - e_1 e_1 e_i = - e_i$. But $e_1 \mapsto e_1 e_1 e_1 = e_1$. Thus $\tau_v$ preserves the $e_1$ axis, but is an inversion on the perpendicular space. The same is true for $\tau_a \colon v \mapsto a v a$ for any unit vector $a$. The line through $a$ is fixed pointwise, and the perpendicular space to $a$ undergoes an involution where everything is negated. The combination of $\tau_a$ followed by $\tau_b$ then fixes everything perpendicular to the $a,b$ plane. However, on the $a,b$ plane, if $h$ is rotation from $a$ to $b$ then $\tau_b\circ \tau_a$ is $h$ applied twice, giving twice the rotation of a direct rotation from $a$ to $b$. Thus the map $v \mapsto (ba)v(ab)$ rotates the $a,b$ plane as if a single rotation directly from $a$ to $b$ had been applied twice, whereas it leaves the subspace perpendicular to the $a,b$ plane pointwise fixed. The scalar part of $ab$ is the cosine of $\psi$ where $\psi$ is the angle from $a$ to $b$. Put differently, the angle of rotation is $\theta$ and the scalar part of this rotation is $\cos(\theta/2)$. Because $abab = -aabb = -1$ then if we write out the taylor series expansion for $exp(t ab)$ where t is a scalar we get $cos(t) + sin(t) ab$. The inverse of $cos(t) + sin(t)ab$ is $cos(t) + sin(t) ba = cos(t) - sin(t)ab$. Thus there is a rotation subgroup cos(t) + sin(t)ab, i.e. $v \mapsto (cos(t) - sin(t) ab) v (cos(t) + sin(t) ab)$ When $t=0$ this is the identity map and when $t=\pi/2$ this is the rotation $ v\mapsto (ba) v (ab)$ Now $ab = a\cdot b + a \wedge b$. If $a,b$ are unit vectors, then multiplying $[\cos(t) + \sin(t)ab][\cos(s) + \sin(s)ab]$ yields $\cos(t) \cos(s) + \sin(t)\sin(s)abab + \cos(t)\sin(s) ab + \sin(t)\cos(s) ab = (\cos(t) \cos(s) - \sin(t)\sin(s)) + [ \cos(t)\sin(s) + \sin(t)\cos(s)]ab$ but this is just $\cos(t + s) + \sin(t +s) ab$ In 4 dimensions a rotation is determined by the element $( \cos(s/2) + \sin(s/2) e_1 e_2 ) ( \cos(t/2) + \sin(t/2) e_3 e_4 )$ Expanding gives $\cos(s/2) \cos(t/2) + \cos(s/2) \sin(t/2) e_3 e_4 + \cos(t/2) \sin(s / 2) e_1 e_2 + \sin(s/2) \sin(t/2) e_1 e_2 e_3 e_4$ So there is a vector and a pseudovector and one is $\cos(s/2) \cos(t/2)$ and the other is $\sin(s/2) \sin(t/2) e_1 e_2 e_3 e_4$ There is also a bivector part which is where we get real direction (because the vector and pseudovector are directionless). So given a unit versor for $\mathbb{R}^4$ is there necessarily a corresponding rotation, and if so, can we read off the planes of rotations and their rotation angles as we can in 3 dimensions? That is, given $a + b_{12) e_1 e_2 + b_{13} e_1 e_3 + b_{14} e_1 e_4 + b_{23} e_2 e_3 + b_{24} e_2 e_4 + b_{34} e_3 e_4 + c e_1 e_2 e_3 e_4$ a unit vector where the sum of squares of the coefficients is one, does this always give a rotation, and if so, what are the planes and their rotations? Now the grade two part should be the sum of two different blades, each one the orthogonal complement of the other. A single 2 blade in $\mathbb{R}^4$ has grade two part satisfying $u_{12}u_{34} + u_{13}u_{24} + u{14}u_{23} = 0$. The orthogonal blade grade 2 part has $v_{12} = u_{34}$ and $v_{13} = u_{24}$ etc.... One would expect that every possible choice of $b_{ij}$ can be represented nearly uniquely as a weighted sum of two orthogonal 2 blades, i.e. choose 1, then the other is detemined, and one can choose the coefficient of the other which gives just the right dimension.

Deforming complex space using Beltrami coefficients on a pair of foliations

Since the pair of equations for deforming a pair of foliations is a pair of PDSs involving the beltrami coefficients, I checked to see whether they were simpler to work with if the beltrami coefficients were in turn represented as a ratio involving $\partial \bar{z}$ and $\partial z$ but that didn’t help.
I had worked out a series solution to them years ago, looking for simple solutions. The simple solutions all looked like they turned out to be trivial as I recall (i.e. no different than a deformation of each coordinate separately).
However, I went ahead and wrote out the series solution again just because the PDEs involved obviously have a series solution, and PDE’s can be hard to solve.
If $b_1$ and $b_2$ are the beltrami coefficients of the horizontal and vertical holomorphic leaves, the PDE pair is

\[\dfrac{\partial b_1}{\partial \bar{z}_2} = b_2 \dfrac{\partial b_1}{\partial z_2}\]

and

\[\dfrac{\partial b_2}{\partial \bar{z}_1} = b_1 \dfrac{\partial b_2}{\partial z_1}\]


You can select any two convergent series \[\sum a_{ijk} z_1^i z_2^j \bar{z}_1^k\] and \[\sum c_{ijk} z_1^i z_2^j \bar{z}_2^k\] then there is a nice series relationship that gives you actual solutions $b_1$ and $b_2$ as an infinite series. The series solution is nicer if you actually represent $b_1$ and $b_2$ as
\[ b_1 = \sum A_{ij} \bar{z}_1^i \bar{z}_2^j\]
and
\[ b_2 = \sum C_{ij} \bar{z}_1^i \bar{z}_2^j\]
where $A_{ij}$ and $C_{ij}$ are holomorphic functions. 

You then get the relationships:

\[ (j+1) A_{i,j+1} = \sum_{k \leq i, \ell \leq j} \dfrac{\partial A_{k\ell}}{\partial z_2} C_{i-k,j-\ell}\]

\[ (i+1) C_{i + 1,j} = \sum_{k \leq i, \ell \leq j} \dfrac{\partial C_{k\ell}}{\partial z_1} A_{i-k,j-\ell}\]

This recursive relationship requires that you specify each $A_{i0}$ and each $C_{0j}$ as a holomorphic function.

Of course, where resulting series converges is a different matter altogether. But it appears this gives lots of solutions to the system of PDEs. For example, if the constants $a_{ijk}$ and $c_{ijk}$ get small very rapidly, this method would give you convergent solutions to the PDE.
Now, of course, there are many solutions, because one can simply take two transverse holomorphic motions and turn them into a deformation by treating them as a coordinate system. That is, take two different pairs of transverse foliations. For each foliation in each pair choose a leaf. Choose a way to match the selected two leaves that came from one pair with the selected two leaves that came from the other pair. Now extend by treating the transverse foliations as giving coordinates.
The PDE is written for the case where one of these pairs is simply the regular horizontal and vertical leaves, and we are specifying beltrami coefficients to deform the structure of those leaves.

So there are lots of solutions. But it is interesting to have solutions specified by beltrami coefficients. It means you can deform an existing structure and still have a holomorphic space. 

Deforming $\mathbb{C}^2$ with a pair of holomorphic foliations

I went over my old notes on deforming $\mathbb{C}^2$. Pairs of laminations of $\mathbb{C}^2$ by holomorphic curves was a recurring theme in holomorphic dynamics, and I worked out the conditions using the Neienhaus tensor for doing quasiconformal deformation on a pair of laminations and obtaining a new complex structure on $\mathbb{C}^2$. The point is more or less that two transverse laminations are deformed into two different transverse laminations, which goes along with the leaves of each of those laminations undergoing a quasiconformal deformation. I had worked out a few geometric basics as well, sort of to check the viability of the idea. For smooth deformations locally one would have an $\mathbb{R}$ linear map of $\mathbb{C}^2$. It turns out that there are two open conditions for such maps - they either map two complex lines to two complex lines or else they map no complex lines to complex lines. I don't think this is a direction to pursue right now, as the condition one obtains on the beltrami coefficients is simple and elegant, yet it is a nonlinear PDE. Given the implicit geometric meaning it might be interesting in looking at whether the property of being less than norm 1 propagates in a solution.

I need to rewrite the equations using \[b_i = \dfrac{\frac{f}{zbar}}{\frac{f}{z}}\] since $b_i$ IS a beltrami coefficient.

There was some horrible thing that happened when you tried to develop a series solution, and I don't recall what it was.

Data Science - Illustration with Cancer Data

python_test
In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
In [3]:
import os
os.getcwd()
Out[3]:
'C:\\Users\\John Robertson\\Documents'
In [4]:
os.chdir('C:\\Users\\John Robertson\\Documents\\python_test')
os.getcwd()
Out[4]:
'C:\\Users\\John Robertson\\Documents\\python_test'
In [7]:
headerList = []
headerFile = open('field_names.txt','r')
for line in headerFile:
    nextHeader = line.rstrip()
    if nextHeader:
        headerList.append(nextHeader)
headerFile.close()
print(headerList)
['ID', 'diagnosis', 'radius_mean', 'radius_sd_error', 'radius_worst', 'texture_mean', 'texture_sd_error', 'texture_worst', 'perimeter_mean', 'perimeter_sd_error', 'perimeter_worst', 'area_mean', 'area_sd_error', 'area_worst', 'smoothness_mean', 'smoothness_sd_error', 'smoothness_worst', 'compactness_mean', 'compactness_sd_error', 'compactness_worst', 'concavity_mean', 'concavity_sd_error', 'concavity_worst', 'concave_points_mean', 'concave_points_sd_error', 'concave_points_worst', 'symmetry_mean', 'symmetry_sd_error', 'symmetry_worst', 'fractal_dimension_mean', 'fractal_dimension_sd_error', 'fractal_dimension_worst']
In [8]:
df = pd.read_csv('breast-cancer.csv', header = None, names = headerList, index_col = 0)
In [112]:
print("Dimensions are " + str(df.shape))
print("Number of malignant " + str(len(df[df.diagnosis == 'M'])))
print("Number of benign " + str(len(df[df.diagnosis == 'B'])))
print(df.iloc[0]) # take a look at an element
Dimensions are (569, 31)
Number of malignant 212
Number of benign 357
diagnosis                            M
radius_mean                      17.99
radius_sd_error                  10.38
radius_worst                     122.8
texture_mean                      1001
texture_sd_error                0.1184
texture_worst                   0.2776
perimeter_mean                  0.3001
perimeter_sd_error              0.1471
perimeter_worst                 0.2419
area_mean                      0.07871
area_sd_error                    1.095
area_worst                      0.9053
smoothness_mean                  8.589
smoothness_sd_error              153.4
smoothness_worst              0.006399
compactness_mean               0.04904
compactness_sd_error           0.05373
compactness_worst              0.01587
concavity_mean                 0.03003
concavity_sd_error            0.006193
concavity_worst                  25.38
concave_points_mean              17.33
concave_points_sd_error          184.6
concave_points_worst              2019
symmetry_mean                   0.1622
symmetry_sd_error               0.6656
symmetry_worst                  0.7119
fractal_dimension_mean          0.2654
fractal_dimension_sd_error      0.4601
fractal_dimension_worst         0.1189
Name: 842302, dtype: object
In [9]:
# create a normalized version, where each variable is centered and normalized by the std dev
df_numerical = df.drop(['diagnosis'],axis = 1)
df_numerical_norm = (df_numerical - df_numerical.mean())/df_numerical.std()
df_norm = df.loc[:,['diagnosis']].join(df_numerical_norm)
print(df_norm.iloc[0]) # take a look at an element
diagnosis                            M
radius_mean                     1.0961
radius_sd_error               -2.07151
radius_worst                   1.26882
texture_mean                   0.98351
texture_sd_error               1.56709
texture_worst                  3.28063
perimeter_mean                 2.65054
perimeter_sd_error             2.53025
perimeter_worst                2.21557
area_mean                      2.25376
area_sd_error                  2.48755
area_worst                   -0.564768
smoothness_mean                2.83054
smoothness_sd_error            2.48539
smoothness_worst             -0.213814
compactness_mean                1.3157
compactness_sd_error           0.72339
compactness_worst             0.660239
concavity_mean                 1.14775
concavity_sd_error            0.906286
concavity_worst                1.88503
concave_points_mean            -1.3581
concave_points_sd_error        2.30158
concave_points_worst           1.99948
symmetry_mean                  1.30654
symmetry_sd_error              2.61436
symmetry_worst                 2.10767
fractal_dimension_mean         2.29406
fractal_dimension_sd_error      2.7482
fractal_dimension_worst        1.93531
Name: 842302, dtype: object
In [10]:
# We want to plot the data
# Visualization can help us recognize dangers, unusual features, 
# and our end results should correspond with what we can see visually
# so it helps prevent techinical mistakes from 
# leading us to wrong conclusions
for label in headerList[2:]:
    bins = np.linspace(-4,4,100)
    plt.hold(True)
    plt.hist(df_norm[label][df.diagnosis == 'B'],bins, alpha = .5, label = 'B')
    plt.hist(df_norm[label][df.diagnosis == 'M'],bins, alpha = .5, label = 'M')
    plt.legend(loc='upper right')
    plt.suptitle(label)
    plt.show()
C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:8: MatplotlibDeprecationWarning: pyplot.hold is deprecated.
    Future behavior will be consistent with the long-time default:
    plot commands add elements without first clearing the
    Axes and/or Figure.
  
C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\__init__.py:805: MatplotlibDeprecationWarning: axes.hold is deprecated. Please remove it from your matplotlibrc and/or style files.
  mplDeprecation)
C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\rcsetup.py:155: MatplotlibDeprecationWarning: axes.hold is deprecated, will be removed in 3.0
  mplDeprecation)
In [11]:
# Compute the mean and median smoothness and compactness for benign and malignant tumors - 
# do they differ? 
# Explain how you would identify this.

# Answer. We have three columns for smoothness and three columns for compactness. 
# It is not clear what smoothness_sd_error or compactness_sd_error mean. Without more understanding of the data
# I would assume that I am being asked for the mean and median of the columns smoothness_mean and compactness_mean
# it should be noted that the visual histograms plotted above give a meaningful answer already. 

# I am computing a normalized mean and median which makes it easy to tell by inspection that their difference is significant

print("Malignant smoothness mean = " + str(np.mean(df_norm.smoothness_mean[df.diagnosis == 'M'])))
print("Benign smoothness mean = " + str(np.mean(df_norm.smoothness_mean[df.diagnosis == 'B'])))

print("Malignant smoothness median = " + str(np.median(df_norm.smoothness_mean[df.diagnosis == 'M'])))
print("Benign smoothness median = " + str(np.median(df_norm.smoothness_mean[df.diagnosis == 'B'])))

# If this was for a scientific study was going to be published then I would use traditional statistical tests -- some version of 
# students t-test is the standard I believe.
# Problems like this that are classical statistics
# are not commonly called "big data" because they were doable before the days of terabytes of data and hardware
# capable of processing that. In fact, they could be computed (tediously) before computers existed by hand. 
# By inspection, since there are 500 computations, the variation in the means should be on magnitute of 1/sqrt(200) or about 1/14
# But instead they differ by 1.1 and by .9. So they are roughly 8 sample deviations apart, which means they are genuinely different
Malignant smoothness mean = 0.7210558324558541
Benign smoothness mean = -0.4281900181530531
Malignant smoothness median = 0.40232407996917197
Benign smoothness median = -0.502043643388796
In [12]:
# Write a function to generate bootstrap samples of the data

# Bootstrap samples are samples with replacement so to get a sample of N rows with replacement we would use

from random import randint

def getSamples(n,dataFrame):
    newList = []
    rowCount = len(dataFrame)
    for i in range(n):
        newList.append(randint(0, rowCount-1))
    return df.iloc[newList]
In [13]:
# test our Bootstrap function to generate a set of 10 samples

getSamples(10,df_norm)
Out[13]:
diagnosis radius_mean radius_sd_error radius_worst texture_mean texture_sd_error texture_worst perimeter_mean perimeter_sd_error perimeter_worst ... concavity_worst concave_points_mean concave_points_sd_error concave_points_worst symmetry_mean symmetry_sd_error symmetry_worst fractal_dimension_mean fractal_dimension_sd_error fractal_dimension_worst
ID
865468 B 13.37 16.39 86.10 553.5 0.07115 0.07325 0.080920 0.028000 0.1422 ... 14.260 22.75 91.99 632.1 0.10250 0.25310 0.33080 0.08978 0.2048 0.07628
89346 B 9.00 14.40 56.36 246.3 0.07005 0.03116 0.003681 0.003472 0.1788 ... 9.699 20.07 60.90 285.5 0.09861 0.05232 0.01472 0.01389 0.2991 0.07804
874858 M 14.22 23.12 94.37 609.9 0.10750 0.24130 0.198100 0.066180 0.2384 ... 15.740 37.18 106.40 762.4 0.15330 0.93270 0.84880 0.17720 0.5166 0.14460
857010 M 18.65 17.60 123.70 1076.0 0.10990 0.16860 0.197400 0.100900 0.1907 ... 22.820 21.32 150.60 1567.0 0.16790 0.50900 0.73450 0.23780 0.3799 0.09185
869218 B 11.43 17.31 73.66 398.0 0.10920 0.09486 0.020310 0.018610 0.1645 ... 12.780 26.76 82.66 503.0 0.14130 0.17920 0.07708 0.06402 0.2584 0.08096
864726 B 8.95 15.76 58.74 245.2 0.09462 0.12430 0.092630 0.023080 0.1305 ... 9.414 17.07 63.34 270.0 0.11790 0.18790 0.15440 0.03846 0.1652 0.07722
901028 B 13.87 16.21 88.52 593.7 0.08743 0.05492 0.015020 0.020880 0.1424 ... 15.110 25.58 96.74 694.4 0.11530 0.10080 0.05285 0.05556 0.2362 0.07113
923465 B 10.82 24.21 68.89 361.6 0.08192 0.06602 0.015480 0.008160 0.1976 ... 13.030 31.45 83.90 505.6 0.12040 0.16330 0.06194 0.03264 0.3059 0.07626
91376702 B 17.85 13.23 114.60 992.1 0.07838 0.06217 0.044450 0.041780 0.1220 ... 19.820 18.42 127.10 1210.0 0.09862 0.09976 0.10480 0.08341 0.1783 0.05871
91544002 B 11.06 17.12 71.25 366.5 0.11940 0.10710 0.040630 0.042680 0.1954 ... 11.690 20.74 76.08 411.1 0.16620 0.20310 0.12560 0.09514 0.2780 0.11680

10 rows × 31 columns

In [14]:
# Random forest variable importance is a common way
# to pick out which variables are most important

from sklearn.ensemble import ExtraTreesClassifier

forest = ExtraTreesClassifier(n_estimators = 500)
forest.fit(df_numerical_norm, df.diagnosis)
importances = forest.feature_importances_
# importance_stds = np.std([tree.feature_importances_ for tree in forest.estimators_], axis = 0)
importance_indices = np.argsort( importances )[::-1]
for i in range(df_numerical_norm.shape[1]):
    print( list(df_numerical_norm)[i] + " " + str(importances[i]))
print("-------------")
print("In order of importance")
for i in range(df_numerical_norm.shape[1]):
    j = importance_indices[i]
    print( list(df_numerical_norm)[j] + " " + str(importances[j]))

plt.plot(range( len(importance_indices)), importances[ importance_indices ], 'ro')
plt.show()
radius_mean 0.053421045747899666
radius_sd_error 0.018758082281329424
radius_worst 0.059980675352813054
texture_mean 0.05582382656416694
texture_sd_error 0.010970244884948332
texture_worst 0.018674215638445447
perimeter_mean 0.061087710397355624
perimeter_sd_error 0.09380445329204413
perimeter_worst 0.007643975139285527
area_mean 0.006275077476443084
area_sd_error 0.02357754182383073
area_worst 0.005203512929981728
smoothness_mean 0.020029228669550432
smoothness_sd_error 0.037318116760044644
smoothness_worst 0.006180432373471859
compactness_mean 0.0068107591790187855
compactness_sd_error 0.008274600673487047
compactness_worst 0.009290031042625912
concavity_mean 0.006046108181319825
concavity_sd_error 0.00630317987710586
concavity_worst 0.0947999219538862
concave_points_mean 0.025900099207817617
concave_points_sd_error 0.08206509580588969
concave_points_worst 0.08425461459043444
symmetry_mean 0.01957570597038478
symmetry_sd_error 0.02465788075135929
symmetry_worst 0.041246989728128326
fractal_dimension_mean 0.0868849053263718
fractal_dimension_sd_error 0.015133958397684163
fractal_dimension_worst 0.010008009982875762
-------------
In order of importance
concavity_worst 0.0947999219538862
perimeter_sd_error 0.09380445329204413
fractal_dimension_mean 0.0868849053263718
concave_points_worst 0.08425461459043444
concave_points_sd_error 0.08206509580588969
perimeter_mean 0.061087710397355624
radius_worst 0.059980675352813054
texture_mean 0.05582382656416694
radius_mean 0.053421045747899666
symmetry_worst 0.041246989728128326
smoothness_sd_error 0.037318116760044644
concave_points_mean 0.025900099207817617
symmetry_sd_error 0.02465788075135929
area_sd_error 0.02357754182383073
smoothness_mean 0.020029228669550432
symmetry_mean 0.01957570597038478
radius_sd_error 0.018758082281329424
texture_worst 0.018674215638445447
fractal_dimension_sd_error 0.015133958397684163
texture_sd_error 0.010970244884948332
fractal_dimension_worst 0.010008009982875762
compactness_worst 0.009290031042625912
compactness_sd_error 0.008274600673487047
perimeter_worst 0.007643975139285527
compactness_mean 0.0068107591790187855
concavity_sd_error 0.00630317987710586
area_mean 0.006275077476443084
smoothness_worst 0.006180432373471859
concavity_mean 0.006046108181319825
area_worst 0.005203512929981728
In [15]:
# The plot of variable importance using random forests is very useful
# Offhand, it is not necessarily best to just grab the top 3 or 5 
# most important variables. We see distinct groups of variables with 
# comparable importance in this plot, and it may be that they have comparable
# importance because they are strongly correlated, i.e. possibly variables ranked 
# 6,7,8 above are so close in importance because they are tightly correlated
# and each one gives no more information than the others. But we have cut down the 
# playing field of interesting variables significantly.
In [37]:
# Identify 2-3 variables that are predictive of a malignant tumor.
# Display the relationship visually and write 1-2 sentences explaining the relationship.

# The two strongest ones are fractal_dimension_mean and concavity_worst and malignant tumors 
# have larger values of both of those. I don't know precisely how those geometric quantities were
# measured. Offhand, one sounds like it means malignant tumors have a more pitted and crinkled surface.
# I have already displayed the relationship visually with the histograms above.

plt.plot(df_norm.fractal_dimension_mean[df.diagnosis == 'B'], df_norm.concavity_worst[df.diagnosis == 'B'],'o', alpha = 0.2, label='Benign')
plt.plot(df_norm.fractal_dimension_mean[df.diagnosis == 'M'], df_norm.concavity_worst[df.diagnosis == 'M'],'o', alpha = 0.2, label='Malignant')
plt.legend(loc = 'upper left')
plt.axis('scaled')
plt.show()
# Plotting these two variables for both groups together it appears that they are not too strongly correlated 
# and that each of these two variable independently helps reduce the overlap between malignant and benign tumors
# That is, the x,y pairs are more separated than either the x coordinates alone or the y coordinates alone would be
In [18]:
from sklearn.cross_validation import cross_val_score
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
In [19]:
X = df_numerical_norm
Y = df_norm.diagnosis
In [20]:
forest = ExtraTreesClassifier(n_estimators = 500)
In [21]:
forest_result = cross_val_score(forest, X, Y, cv = 5)
In [22]:
print(forest_result)
[0.94782609 0.96521739 0.98230088 0.96460177 0.96460177]
In [23]:
# These scores are the portion of correctly classified samples. 
# These are good scores, and they are consistent scores.
# One of the downsides of cross validation in python is that it doesn't
# return the scores on the training set as well as on the test set.
# You will normally see better scores on the training set than on the 
# test set. But if you see significantly better scores on the 
# training set than on the test set, that is because you are overfitting
# the data. Effectively, these scores are so high that we 
# know we are not overfitting dramatically anyway.
In [24]:
# I already determine the most important variables in a random forest model
In [25]:
# I like SVMs but they are poor at helping you identify the most important variables
# So for the second case I will just use linear regression
In [26]:
from sklearn.svm import SVC
In [27]:
svm = SVC()
In [28]:
svmResult = cross_val_score(svm, X, Y, cv = 5)
In [29]:
svmResult
Out[29]:
array([0.97391304, 0.96521739, 1.        , 0.96460177, 0.97345133])
In [30]:
# It is not easy from an SVM to determine what the most important variables are.
# SVMs are more of a black box. They are best where there is 
# sparse data and you want a black box predictor rather than insight
# about the meaning of the predictions. That is why they
# are used so frequently in computer vision when the data is ALWAYS sparse.
# We know we didn't overfit because the results are so very high. 
# Overfitting with a linear SVM is very unlikely. They specialize in being robust
# against overfitting.

More about rotations in 4 dimensions.

It is easy to take a quaternion and recognize the rotation it actually represents. This is about doing the same thing for 4 dimensional rota...