Blog

Margin of Error

With the pending USA election, the news is awash with poll results showing a candidate’s preference by voters. And when these results are presented, there is usually a caveat that “however, these results are within the margin of error” which actually makes the results a bit less conclusive. Why is that?

Without going through the plethora of maths that arrives at what follows, let me explain.

If we are trying to determine a parameter of a population (like the percentage of people that prefer a candidate), we need to ask everyone in the population in order to know the answer exactly. This is impossible in many situations, especially in the USA where it is not practical to ask everyone who will vote. So a sample of voters must be used. Now there are a lot of things to be considered to make sure that the sample used is truly random (that is, not biased), but let’s assume going forward that the samples used are random.

First, without the math: for large samples, the distribution of the sample result (here, the percentage measured in the sample) is approximately normal. This is fancy statistical wording that means the values one gets taking sample after sample will follow a bell curve:

This curve is adjusted so that the probability of the parameter of interest between two values is the area under the curve. This means that the area under the entire curve must be 100%. So the probability that the parameter is between a and b, based on the sample, is the shaded area below:

If this is 68% of the total area, then that is the probability that the parameter being measured is between a and b.

Now let’s get to the current scenario. Suppose 1000 people are surveyed and 52% prefer candidate A and 48% prefer candidate B. Let’s look at the associated bell curve for candidate A:

A lot of math is involved here and a lot of assumptions (though they are reasonable). Notice that the curve is centered at the sample result of 52% (0.52 is the decimal equivalent). The range 0.49 to 0.55, which is 0.52 − 0.03 to 0.52 + 0.03, contains 95% of the area (that is, of the probability). Without going through a lot of theory here, this range of numbers is the 95% confidence interval for this sample. So a statistician can say “based on this sample, I am 95% confident that the true percentage of all voters who support candidate A is between 49% and 55%”. This means that based on this sample, the true preference for candidate A can be as low as 49%. The number 0.03 which is added to and subtracted from the sample result is called the margin of error. This 95% confidence interval is the most common one used.

Now let’s look at the bell curve for candidate B and its associated confidence interval:

Notice that the 95% confidence interval for this result is 45% to 51%. That is, based on this sample, we can be 95% confident that the true percentage of all voters who support candidate B is between 45% and 51%. This means that the true preference for candidate B can be as high as 51%. And this is higher than the possible low preference for candidate A at 49%. That means that even though the sample shows that candidate A is preferred, the difference between the two values is not significant enough to make the statement that candidate A is truly preferred at the 95% confidence level. In other words, the result is within the margin of error.

Now let’s say that the survey result was that candidate A was preferred at 55% and candidate B at 45%. The confidence interval of candidate A’s bell curve would be 52% to 58%. Candidate B’s confidence interval would be 42% to 48%. So based on this sample, the highest that the true preference for candidate B could be is 48% and the lowest that the true preference for candidate A could be is 52%. There is no overlap here so this would be significant enough to say that all voters do prefer candidate A. That is, the result is outside the margin of error. And when you see results with that statement, that is much better for candidate A.
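For readers who want to see where the 0.03 comes from, here is a minimal Python sketch. It assumes the usual normal-approximation formula for a sample proportion, margin = z × √(p(1 − p)/n) with z = 1.96 for 95% confidence. The post does not state this formula, so treat it as my assumption about the math being skipped. The function names are made up for illustration.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a large random sample so the normal approximation applies;
    z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=1.96):
    """Return (low, high) for the approximate confidence interval."""
    moe = margin_of_error(p_hat, n, z)
    return (p_hat - moe, p_hat + moe)

def intervals_overlap(a, b):
    """True if the two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

n = 1000  # number of people surveyed
for p_a, p_b in [(0.52, 0.48), (0.55, 0.45)]:
    ci_a = confidence_interval(p_a, n)
    ci_b = confidence_interval(p_b, n)
    verdict = "within" if intervals_overlap(ci_a, ci_b) else "outside"
    print(f"A: {ci_a[0]:.3f} to {ci_a[1]:.3f}, "
          f"B: {ci_b[0]:.3f} to {ci_b[1]:.3f} -> {verdict} the margin of error")
```

With 1000 people surveyed, the margin works out to roughly 0.03 in both scenarios, which matches the intervals quoted above.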

Coordinate Systems – Parametric Equations

A recurring property of coordinate systems is that in order to locate a point in an n-dimensional space, you need n numbers (or n independent pieces of information about that point). There is a way to “cheat” this using just one number (called a parameter) to locate an n-dimensional point.

This isn’t really cheating as you still have to initially provide the required information, but once done, one number will suffice to place a point.

An example of this is the equation of a circle of radius r:

\[x^2+y^2=r^2\]

This is the standard Cartesian equation but a parametric way of defining a circle, using a parameter t, is

\[x=r\,\text{cos}(t)\\y=r\,\text{sin}(t)\]

Defined in this way, any value of t will generate a point on the same circle. We can generate the Cartesian equation from these two parametric equations, but I will leave that as a topic for a future post.
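As a quick numerical check (not a derivation), the short Python sketch below picks a few values of t and confirms that each generated point satisfies the Cartesian equation. The radius and the t values are arbitrary choices for illustration.

```python
import math

def circle_point(r, t):
    """Point on a circle of radius r for parameter t (in radians)."""
    return (r * math.cos(t), r * math.sin(t))

r = 3.0  # arbitrary radius chosen for this example
for t in [0.0, 1.0, 2.5, 4.0]:
    x, y = circle_point(r, t)
    # x^2 + y^2 should equal r^2 (up to floating-point rounding)
    print(f"t = {t}: x^2 + y^2 = {x**2 + y**2:.6f}, r^2 = {r**2:.6f}")
```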

Parametric equations can be a much more useful way to represent a curve, especially curves that model a physical process.

If a projectile is launched with an initial speed of 31.6267 m/s at an angle of 50.78° from the ground, it will follow a parabolic trajectory which can be represented by the equation

\[y=-0.01225(x-50)^2+30.625\]

where y is the height above the ground and x is the distance along the ground from the launch point. The trajectory of the projectile (the graph of the above equation) looks like

This graph is useful in that it tells us how high the projectile goes and how far. But it doesn’t tell us where the projectile is at any time or how long it takes to hit the ground.

The initial velocity can be broken up into a horizontal and a vertical component. These components can be treated separately:

If resistance due to the air is neglected, the horizontal distance at a given time t is x = 20t. The vertical distance cannot be treated as simply, because the vertical velocity is constantly changing due to gravity. From physics and calculus, the vertical distance is y = −4.9t² + 24.5t. These two equations are the parametric forms of the Cartesian trajectory equation. For any time t, a point on this trajectory, (20t, −4.9t² + 24.5t), is located and represents where the projectile is at that time:

Notice how we get more information about what is going on with this way of representing a graph. We can now tell where the projectile is at any time t, that it takes 2.5 seconds to reach the top of the trajectory, and that it takes 5 seconds to return to the ground.
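The numbers above can be checked with a small Python sketch. It uses the two parametric equations and the Cartesian equation from this post. The function names are just illustrative.

```python
def position(t):
    """Projectile position (x, y) in metres at time t in seconds."""
    x = 20.0 * t
    y = -4.9 * t**2 + 24.5 * t
    return x, y

def cartesian_height(x):
    """Height from the Cartesian trajectory equation, for comparison."""
    return -0.01225 * (x - 50.0)**2 + 30.625

for t in [0.0, 1.0, 2.5, 4.0, 5.0]:
    x, y = position(t)
    # The parametric point should lie on the Cartesian parabola
    print(f"t = {t:.1f} s: x = {x:.1f} m, y = {y:.3f} m, "
          f"parabola gives {cartesian_height(x):.3f} m")
```

At t = 2.5 the height is 30.625 m (the top of the trajectory), and at t = 5 the height is back to 0 at x = 100 m.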

This is not a unique way to represent this trajectory parametrically, but this one conforms to the physics of the problem. In general, parametric equations make it possible to plot graphs that are difficult or even impossible to plot with a single Cartesian equation.

This can be extended to higher dimensions as well. There is a lot of mathematics around parametric equations and it adds to the wonder (and complexity) of maths.

Coordinate Systems – 3D, part 2

Once again, if you want to locate a point in 3-dimensional space, you need 3 numbers. In my last post, the 2-D Cartesian coordinate system (sometimes called the rectangular coordinate system) was extended to 3-D by adding another axis that is perpendicular to the other two axes. A point is then located using the coordinates (x, y, z), (1, −2, 3) for example. Here are two more ways to locate a 3-D point that use the rectangular system as a backdrop.

Spherical Coordinate System

If you remember in the 2-D scenario, polar coordinates used an angle (θ) from the positive x-axis and a distance (r) from the origin to determine the location of a point. And equations to represent a plot of points that satisfied the relationship between these coordinates had r’s and θ’s in them. In 3-D, the spherical coordinate system extends this method.

There are different conventions here but they all use two angles and a distance. The mathematical convention is shown below:

Source: https://en.wikipedia.org/wiki/Spherical_coordinate_system#/media/File:3D_Spherical_2.svg

Here, a point is located by an angle from the positive x-axis, θ (like in polar coordinates), an angle from the positive z-axis, φ, and a distance, r, from the origin. A point in this system has coordinates (r, θ, φ). As with polar coordinates, there are curves that are more easily expressed in spherical coordinates. For example, a sphere of radius 4 centred at the origin can be easily expressed in spherical coordinates as r = 4:

Or how about:

\[r=4\text{cos}(2\theta)\text{sin}^2(\varphi)\]

There are other conventions for spherical coordinates, one of which you are very familiar with. Locating a point on the earth is typically done with two numbers, longitude and latitude. Longitude is the angle a location is from the agreed reference meridian that runs through Greenwich, England, and latitude is its angle from the equator. If the origin is at the earth’s centre with the x-axis going through the reference meridian (called the prime meridian) and the z-axis going through the north pole, longitude is our θ, latitude is 90° − φ, and r is always the radius of the earth.

Another convention is used to locate earth satellites using angles right ascension (similar to longitude) and declination (similar to latitude) from an agreed earth centred coordinate system where the axes are fixed and do not rotate with the earth.

There are other variations of this coordinate system; these are just a few.
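To make the link back to the rectangular system concrete, here is a minimal Python sketch of the standard conversion under the mathematical convention described above (θ from the positive x-axis, φ from the positive z-axis). The latitude/longitude helper uses the relationships in this post, with θ equal to longitude and φ equal to 90° minus latitude. The earth radius of 6371 km is an assumed round value, and the function names are hypothetical.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Convert (r, theta, phi) to (x, y, z); angles in radians.

    theta is measured from the positive x-axis, phi from the positive z-axis.
    """
    x = r * math.sin(phi) * math.cos(theta)
    y = r * math.sin(phi) * math.sin(theta)
    z = r * math.cos(phi)
    return x, y, z

def lat_lon_to_cartesian(lat_deg, lon_deg, radius=6371.0):
    """Earth-centred (x, y, z) in km, treating the earth as a sphere.

    radius = 6371 km is an assumed average value.
    """
    theta = math.radians(lon_deg)        # longitude plays the role of theta
    phi = math.radians(90.0 - lat_deg)   # latitude = 90 degrees - phi
    return spherical_to_cartesian(radius, theta, phi)

print(lat_lon_to_cartesian(90.0, 0.0))   # north pole: on the z-axis
print(lat_lon_to_cartesian(0.0, 0.0))    # equator at the prime meridian: on the x-axis
```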

Cylindrical Coordinate System

You can think of the cylindrical coordinate system as the 2D polar system with an added z coordinate:

Source: https://tutorial.math.lamar.edu/classes/calciii/CylindricalCoords.aspx

Different letters/Greek symbols can be used, but they all represent the same system. If you look at the xy plane above, you see that this is just the polar coordinate system that was explained in a previous post. To add the third dimension, just move up z units to the desired point (r, θ, z).

Cylindrical coordinates are useful for describing objects that are symmetrical with respect to the z-axis. For example, a cylinder of radius 4 can be easily described with the equation r = 4:

r = 4

Another example is a cone: z = r:

z = r

Switching between rectangular, spherical, and cylindrical coordinates is a useful tool in calculus. An equation expressed in one of these systems may be unsolvable but solvable in a different system.
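As an illustration of switching systems, the Python sketch below converts between cylindrical and rectangular coordinates using the standard relationships x = r cos(θ), y = r sin(θ), with z unchanged. The function names and sample point are made up for this example.

```python
import math

def cylindrical_to_cartesian(r, theta, z):
    """Convert (r, theta, z) to (x, y, z); theta in radians."""
    return r * math.cos(theta), r * math.sin(theta), z

def cartesian_to_cylindrical(x, y, z):
    """Convert (x, y, z) to (r, theta, z); theta in radians."""
    r = math.hypot(x, y)       # distance from the z-axis
    theta = math.atan2(y, x)   # angle from the positive x-axis
    return r, theta, z

# A point on the cone z = r, chosen arbitrarily
r, theta, z = 2.0, math.radians(60.0), 2.0
x, y, z_cart = cylindrical_to_cartesian(r, theta, z)
print(f"Cartesian: ({x:.3f}, {y:.3f}, {z_cart:.3f})")
print("Back to cylindrical:", cartesian_to_cylindrical(x, y, z_cart))
```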

In my next post, I’ll describe a sneaky way to locate a point in 2 or 3-D with one number: parametric equations.

Coordinate Systems – 3D, part 1

Since we live in a 3 dimensional world, many problems we encounter in fields such as science and engineering, as well as others, are modelled mathematically using 3 variables, hence, 3D.

The first coordinate system introduced to students to handle 3 variables is an extension of the 2D Cartesian coordinate system. If another number line is added to the 2D system that is 90° to the previous 2 axes, with the origin coinciding with the other two origins, you have the 3D system. The third axis is called the z-axis. So a point now needs 3 numbers to place it in 3D space: (x, y, z). Frequently, to draw a 3D grid on a 2D surface, the y and z axes are drawn in the plane of the surface and the x axis is drawn in perspective to show that it is perpendicular to the surface. So placing a point in a 3D Cartesian frame is an artistic challenge for me but drawing dashed lines parallel to the axes helps:

There are other orientations of the 3 axes when showing them in 2D, but this is a very common one.

As with the 2D Cartesian coordinate system, equations relating the variables x, y, and z can be plotted, showing all the values of x, y, and z that make the equation true.

In 2D, a general equation of a line is ax + by = c, where a, b, and c are specific numbers. For example, the set of points that satisfy the equation 2x − 3y = 7 plots as a straight line. By extension, in 3D, the general linear equation is ax + by + cz = d. Though this is called a linear equation, it plots as a plane in 3D:

The 3D version of a circle in 2D is a sphere. The generic equation of a sphere of radius r centred at the origin is x² + y² + z² = r²:

Very interesting shapes can be made using 3D graphs. Here are a few:

\[z=5e^{-0.2(x^2+y^2)}\text{cos}(x^2+y^2)\]
\[z = \pm \sqrt{0.4^2-\left(2-\sqrt{x^2+y^2}\right)^2}\]
\[z=4 e^{-\frac{1}{4} y^2} \sin (2 x)\]

As with 2D, there are other ways of locating points in 3D. I will present some of these in my next post.

Coordinate Systems – 2D, part 2

In my last post, I talked about the Cartesian coordinate system where a point or a set of points can be located using the two numbers (x, y). There is another popular coordinate system that also locates a point in 2D space.

In the graph below, I have plotted the point (5, 3) in the Cartesian coordinate system we now know very well. I have added a line from the origin to that point and noted that the line makes an angle θ with the x-axis and that the length of the line is r. I’ve also added perpendicular lines from the point to the x and y axes to show that similar right triangles are formed:

From this graph, you can see that the right triangles have sides of lengths 5 and 3 units. From the Pythagorean Theorem,

\[r=\sqrt{3^2+5^2}\approx5.83\]

And from trigonometry:

\[\text{tan}(\theta)=\frac{3}{5}\Rightarrow\theta\approx30.96^\circ\]

Why did I do this? Another way to locate that same point is to 1) define a line (also called a ray) from the origin that is 30.96° from the x-axis, then 2) go along that line 5.83 units and stop. That is your point. Welcome to polar coordinates.

This system of locating a point in 2D is called “polar” because the origin is a “pole” from which all the rays that you can define radiate. In the polar coordinate system, you also need two numbers to locate a point: r and θ. Conventionally, a point in polar coordinates is given in the order (r, θ).

The variable r is a point’s distance from the origin. θ is the angle measured from the positive x-axis: anti-clockwise is + and clockwise is −. Because angles repeat every 360° or 2π radians, a particular (r, θ) for a point is not unique. For example, (2, 25°) locates the same point as (2, 385°).
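The conversion worked through above for the point (5, 3) can be sketched in a few lines of Python. The function names are illustrative. `math.atan2` is used instead of a plain arctangent so the angle lands in the correct quadrant.

```python
import math

def to_polar(x, y):
    """Convert Cartesian (x, y) to polar (r, theta in degrees)."""
    r = math.hypot(x, y)
    theta = math.degrees(math.atan2(y, x))
    return r, theta

def to_cartesian(r, theta_deg):
    """Convert polar (r, theta in degrees) to Cartesian (x, y)."""
    theta = math.radians(theta_deg)
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(5, 3)
print(f"r = {r:.2f}, theta = {theta:.2f} degrees")

# (2, 25 degrees) and (2, 385 degrees) locate the same point
print(to_cartesian(2, 25))
print(to_cartesian(2, 385))
```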

Graphing relations is usually done by plotting r as a function of θ. Just as in Cartesian coordinates, the polar graph of an equation between r and θ is a picture of all the points whose (r, θ) coordinates satisfy the equation. For example, the graph below shows all points that satisfy r = 2cos(2θ):

Notice how a grid of concentric circles (possible r values) and rays (possible θ values) is superimposed on the x and y axes. This is a polar graph grid.

There are Cartesian graphs that are more easily expressed and plotted in polar coordinates (and vice-versa). One glaring example is a circle. In the Cartesian frame, the equation of a circle, centred at the origin, is

\[x^2+y^2=r^2\]

where r is the radius. For a circle of radius 2, the above equation would have 4 on the right side and the graph would be a circle of radius 2 centred at the origin. In polar coordinates, the same graph would be r = 2. This is a picture of all points that are 2 units away from the origin:

In orbital dynamics, polar plots are most useful for plotting a 2-body orbit. What is meant by “2-body” will be the subject of another post. The paths of most satellites orbiting the earth are approximated by an ellipse. In Cartesian coordinates, the equation of an ellipse is:

\[\frac{x^2}{a^2}+\frac{y^2}{b^2}=1\]

The parameters a and b determine the size and orientation (long side vertical or horizontal) of the ellipse. For example,

The problem with this plot is that the geometric centre of the ellipse is at the origin. The path of an earth satellite is not the path followed in this plot if the earth is at the origin. The earth is at one of two special points associated with an ellipse called foci (singular focus). It is more useful in orbital dynamics if the ellipse were plotted in polar coordinates. The polar equation of an ellipse (actually any conic shape which includes circles, parabolas, and hyperbolas) is

\[r=\frac{p}{1+e\text{cos}(\theta)}\]

where p and e are parameters that determine the size and the shape (circle, ellipse, parabola, or hyperbola) of the orbit. The parameter p is the y-intercept on a superimposed Cartesian frame and we will limit e to be strictly between 0 and 1 which makes the equation plot as an ellipse. This equation, by the way, is called the orbit equation because it accurately describes the shape of any orbit between two point masses without being perturbed by other masses. An example of an elliptical orbit around the earth with a satellite at a particular position is:

This polar plot is more useful to describe orbits because the earth is at the origin and it shows three of the parameters commonly used to describe a satellite’s position and orbit: p (called the semi-latus rectum), e (called the eccentricity), and θ (called the true anomaly).
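A small Python sketch of the orbit equation shows how p and e shape the ellipse. The values p = 10000 km and e = 0.3 are hypothetical, chosen only for illustration and not taken from the plot in this post.

```python
import math

def orbit_radius(p, e, theta):
    """Distance from the focus (the earth) at true anomaly theta (radians)."""
    return p / (1.0 + e * math.cos(theta))

p = 10000.0  # semi-latus rectum in km (hypothetical value)
e = 0.3      # eccentricity, strictly between 0 and 1 for an ellipse

for deg in [0, 90, 180, 270]:
    r = orbit_radius(p, e, math.radians(deg))
    print(f"theta = {deg:3d} degrees: r = {r:.1f} km")
```

At θ = 0 the satellite is closest to the earth, at r = p/(1 + e). At θ = 180° it is farthest away, at r = p/(1 − e). At θ = 90°, r equals p, which is why p shows up as the y-intercept.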

Polar plots can generate shapes that would be unwieldy to generate in the Cartesian frame:

There are other less popular 2D coordinate systems like the parabolic coordinate system. Here is what parabolic graph paper looks like:

I personally do not want to go there.

Coordinate Systems – 2D, part 1

How do you locate a point on a two-dimensional (2D) surface? Since we are now in two dimensions, it will take a minimum of 2 numbers to locate a point. As in the case for 1D, the 2D surface used can be flat (which this post talks about) or curved: for example, the surface of the Earth, where the most common system to locate a point is the Geographic Coordinate System using latitude and longitude (again, two numbers to locate a point).

Cartesian Coordinate System

The coordinate system most used by students of mathematics is the Cartesian Coordinate System. This was invented by (and named after) René Descartes in the 17th century. This system is used in 3D as well as higher dimensions, but this post is limited to 2D. As most people best learn and retain mathematical concepts visually, this system of plotting was, and still is, indispensable in algebra, calculus, geometry, trigonometry, and many more subjects. So what is the Cartesian Coordinate System?

If you take two 1D number lines, one horizontal and the other vertical, so that they are at 90° to one another and their origins intersect, voilà, you have a Cartesian Coordinate System:

The system above also has a superimposed grid so that we can more easily locate a point.

Conventionally, the horizontal line is called the x-axis, and the vertical one the y-axis. Note the negative numbers are to the left and down. A point on a plane which has this system of location, is said to have coordinates (x, y). Note that x is always first. So a general point (x, y) will have a position such that it is x units left or right of the y-axis and y units above or below the x-axis. Here are some examples:

Analysing points and shapes plotted on a Cartesian coordinate system is called Coordinate Geometry. The lengths and midpoints of plotted lines with defined endpoints can be calculated. But the much more interesting use of a 2D coordinate system is plotting all the points that satisfy a relation between x and y values. This is called plotting an equation.

Suppose you have a relationship (equation) x² + y² = 4. What are the values of x and y that satisfy this equation? There are an infinite number of (x, y) pairs that will solve this equation. For example, (0, 2) solves this equation because 0² + 2² = 4. Even though there are infinitely many solutions, we can draw a picture of all the points that do solve the equation:

As you can see, the set of all points that solve this equation plots as a circle of radius 2. Plots of other equations can look quite strange:

But it is important to remember that the (x, y) coordinates of any point on the graph of a relation make the equation true when you substitute those values into it.

The Cartesian coordinate system is not the only way to locate a point in 2D. I will talk about another popular 2D coordinate system in my next post.

Coordinate Systems – 1D

Many of the posts I have written had plots of functions or relations between two variables, usually x and y. Most of teaching algebra and calculus relies on graphs to illustrate concepts. These graphs are plots of all the points that satisfy an algebraic relation between the two (or more) variables. Behind these plots is the coordinate system used. This series of posts explores the different coordinate systems commonly used in maths. Let’s first look at a one-dimensional (1D) coordinate system.

1D means that one number is needed to locate a point. The most used 1D coordinate system is the number line:

Number lines can be vertical or even curvy, for example, to show distance along a path. Usually though, the number line is a straight horizontal line. But they all have some things in common. First, they have to have a reference point: a point from which all other points obtain their position. This point, here and in all coordinate systems, is called the origin. And second, there is a scale: the distance between the tick marks that allow us to place a point. In the example above, the scale is 1 unit between tick marks. For example, if we want to plot the variable x = 5, the plot would be

There are an infinite number of points on this line: an infinite number of tick marks and an infinite number of points between each tick mark. What are the kinds of numbers that can be plotted?

Any number on the number line is called a real number. This is an actual mathematical term to distinguish these from other types of numbers used in maths such as imaginary numbers (despite the name, imaginary numbers have a real meaning in science and engineering). The set of real numbers is represented by the symbol ℝ. There are several subsets of real numbers.

The first set of numbers you learned as a child were the natural numbers. These are the counting numbers 1, 2, 3, … but do not include 0. This set of numbers is given the symbol ℕ.

Then you learned about 0 and negative integers. Integers are whole numbers (no decimals or fraction parts) and include the natural numbers, 0, and the negative integers. This set of numbers is given the symbol ℤ. Why not 𝕀? Because 𝕀 is the symbol for imaginary numbers which are not real numbers and 𝕀 is also sometimes used to refer to irrational numbers which I will talk about soon. Notice that ℕ is a subset of ℤ which is a subset of ℝ.

The next type of real numbers is the set of rational numbers. These are numbers that can be put into the form p/q where p and q are integers and q is not 0. Any integer, like 2, is a rational number since 2 can be written as 2/1. Any decimal number with a repeating pattern of decimals (even if that is a repeating 0) is a rational number. As ℝ is already used for real numbers, this set of numbers is given the symbol ℚ. This stands for quotient as p/q is a quotient (a maths term for the result of a division). All of the previous sets of numbers are subsets of ℝ.

That leaves the set of irrational numbers: the numbers that cannot be put into the form p/q. Numbers like π or √2 are irrational and symbols like these are the only way to represent the exact values. They cannot be exactly represented as a decimal number as their decimal parts never repeat. There is no common symbol for these but ℙ or 𝕀 are sometimes used. There are few occasions where only irrational numbers are required, but a more common notation would be ℝ\ℚ which means “all real numbers except rational numbers”. Here is a nice picture of how all these types of real numbers are related:

It’s the irrational and some of the rational numbers that lie between the tick marks. So π would be approximately

Plotting single points on the number line is rather boring. But it can also be used to indicate intervals of numbers like all the numbers between −6 and 2. This is shown as −6 < x < 2 where the endpoints are not included or −6 ≤ x ≤ 2 if both endpoints are included or a combination. When plotting these, an open circle means that the endpoint is not included and a filled in circle means that it is included. So −6 < x ≤ 2 would be plotted

There’s not much else we can do when using the 1D number line, but we have a lot more options when expanding to 2D: to be continued.

Engineering Topics – Differential Equations

A differential equation is an equation that has a derivative in it. A derivative is a rate of change, like velocity. So if you are driving a car and your velocity v is some function of time, the equation relating velocity to position can be solved to find your position from a starting point at any time. There are lots of techniques to solve these equations and the study of them scares many students. But the fact is, if you drive a car, pick up a glass of water, or throw a ball at a target, your mind subconsciously handles the differential equations that model these activities quite well.

Let’s stick with the car example. Suppose you are 100 metres away from a stop sign or a red traffic light. So you apply the brakes. Let x(t) be your distance from the stop sign in metres at a time t, t be the time in seconds, and ẋ(t) be your velocity at time t. Now you want your distance to the stop sign to decrease from 100 to 0 metres comfortably in say 15 seconds. What about this linear way:

It does stop at the stop sign, but is it comfortable? This equation of a line has a constant slope (rate of change) of −100/15 = −6.67 m/s = −24 km/hr. This means that at the stop sign, you are still going 24 km/hr when you hit the brakes hard to stop. The passengers drinking coffee at the time would not appreciate that. Also, if you are going 100 km/hr at 100 metres away, to follow this profile you have to slam on the brakes to suddenly drop to a speed of 24 km/hr. Well, maybe there would be no coffee left to spill at the stop sign.

So this shows that we need to be aware of our speed and our distance to do this comfortably.

What about stopping following this red curve:

This starts at 100 metres and ends at 0 like the linear graph but has a better rate of change profile. The rate of change (that is the velocity) varies on this curve. It is visually seen as the slope of the line tangent to the graph at a point. The grey line shown is an example of a tangent line. Notice that the gradient at the beginning of the curve is high (in the negative direction) but at the end, it is near zero (the slope of a horizontal line is zero). This would be a much smoother stop than a linear approach.

But your mind during this action is not just seeing your distance from the stop sign, it is also sensing your velocity and adjusting it as you get closer to the stop sign. The following is an equation that relates the velocity and position:

\[\dot{x}(t)=-0.3[x(t)+1.11]\]

If you were to solve this equation for x(t) using differential equation techniques, you would get the equation seen in the graph above. If you were to design a control system (which is what your mind is when performing this action) you would use the above differential equation to control both your position and your velocity.
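For the curious, here is a minimal Python sketch that solves this differential equation two ways, assuming the starting condition x(0) = 100 metres from the example. The closed-form expression x(t) = (100 + 1.11)e^(−0.3t) − 1.11 is what I get by solving the equation under that assumption. The second function steps the equation forward numerically (a simple Euler method), which is closer in spirit to how a control system would use it. The function names are hypothetical.

```python
import math

def x_exact(t, x0=100.0):
    """Closed-form solution of x'(t) = -0.3 * (x(t) + 1.11) with x(0) = x0."""
    return (x0 + 1.11) * math.exp(-0.3 * t) - 1.11

def x_euler(t_end, x0=100.0, dt=0.001):
    """Step the differential equation forward in small time steps (Euler method)."""
    x = x0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        velocity = -0.3 * (x + 1.11)  # the differential equation itself
        x += dt * velocity
    return x

for t in [0, 5, 10, 15]:
    print(f"t = {t:2d} s: exact x = {x_exact(t):7.3f} m, "
          f"numerical x = {x_euler(t):7.3f} m")
```

Both versions start at 100 metres and arrive very close to 0 metres at 15 seconds.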

But even this stopping profile has flaws. Notice that the deceleration at the beginning is quite steep (the slope of a tangent line at t = 0). Perhaps a better profile would be:

This starts with a more gentle deceleration, increases the deceleration until you get closer to the stop sign, then the deceleration decreases until you come to a full stop at the sign.

Regardless of the stopping profile used, your mind controls the braking action to conform to a desired profile based on your current speed (the slope of the tangent line) and your distance from the stop sign. People who are designing driverless cars, robotic arms, aircraft autopilots, etc, use differential equations. And because they are working in three dimensions, these equations can be in the form of matrix and/or vector equations. And the solutions will use complex numbers: all of these topics were covered in my last few posts.

So besides the basic algebraic skills you may be studying or have studied, more advanced topics like this one or those covered in my last few posts are the heart of engineering.

Engineering Topics – Matrices

If you looked at my last post on complex numbers, the example I used was 1-dimensional: there was only one variable of interest. However, if you include time, the problem is 2-dimensional and we can plot the result on a 2-dimensional coordinate system. But our world is 3-dimensional (4 if you include time). In engineering, there are frequently many more than 3 variables needed to solve a problem. To handle these kinds of problems, matrices are needed. Matrices come with their own set of algebra rules, but you don’t need to know these to follow this post. Let’s look at some examples.

System of Equations

Perhaps the simplest example that looks intuitively correct is using matrices to solve a set of m linear equations with m unknowns. In year 10, students are taught how to solve a system of 2 equations and 2 unknowns. They learn about two methods to solve these: substitution and elimination. These methods can be used for higher number of unknowns but this quickly becomes unwieldy as the number of unknowns increases.

An engineering example using 3 unknowns comes from my experience as an astronautical engineer. I was tasked to model the output of a mechanical gyroscope. A gyroscope is used to measure rotation about a specific direction (axis). This information is used in an inertial navigation system to determine an object’s orientation and velocity. There are several kinds of gyroscopes: mechanical, laser, semiconductor. The one in your smartphone is a semiconductor one. Mechanical ones are still used in aircraft and spacecraft navigation systems because of their long-term stability. However, their electronic output is not directly proportional to the rotational input: there are errors in the output. In an ideal world, the output, v, will be kω = v where ω is the rotation detected (radians/second), v is the output voltage, and k is the conversion factor needed to convert radians/second to volts. But as our world is not ideal, there are errors in the signal produced by the gyroscope. These errors need to be subtracted from the output before it is sent to the navigation system.

The model I used had many more terms in it, but for purposes of this post, I will simplify it to

\[k_1\omega_1+k_2\omega_2+k_3\omega_3=v\]

where k₁ω₁ is the desired output in direction 1, and the other two terms are errors introduced from rotations about the other 2 perpendicular directions in 3-dimensional space. For a particular gyroscope, I had to find the k’s so that the navigation system would know what the actual rotation about direction 1 is.

So the gyroscope was placed on a very accurate test platform where its orientation with respect to the earth’s rotation was accurately known and the platform could also rotate an accurately known amount. So if the gyroscope was subjected to three different orientations/rotations and the output measured at each position, three equations in the 3 k unknowns could be generated. Actually, many more measurements were made. There are errors in the measurements of the outputs and the inputs so I actually used a least squares matrix process (yes, statistics is used in engineering as well) to find the best estimates of the k’s. But again, for purposes of this post, let’s assume we have perfect knowledge of the inputs and outputs so that only three measurements are needed.

Using different rotation rates about the primary measurement axis and the two perpendicular ones, we generate the following set of equations:

\[3k_1+2k_2+1k_3=2.51\\5k_1-5k_2+7k_3=3.82\\6k_1-6k_2-7k_3=4.43\]

where, for example, for the first equation, a rotation of 3 radians/second about the primary axis, 2 radians/second about axis 2, and 1 radian/second about axis 3 generated a voltage of 2.51 volts.

There is a matrix version of this system of equations:

\[\begin{bmatrix}3&2&1\\5&-5&7\\6&-6&-7\end{bmatrix}\begin{bmatrix}k_1\\k_2\\k_3\end{bmatrix}=\begin{bmatrix}2.51\\3.82\\4.43\end{bmatrix}\]

Without going into the rules of matrix algebra, I think you can see how each of the objects in the above equation was assembled: the array (matrix) of numbers holds the coefficients of the unknowns, listed in the same order as in the system of equations. The vertical matrix next to it (also called a vector as it only has the one column) lists the unknowns, and the matrix on the right side holds the numbers on the right side of the system. If we let A be the matrix of coefficients, k be the matrix of unknowns, and b be the matrix of the right side numbers, the matrix equation and its solution are:

\[\textbf{Ak}=\textbf{b}\Longrightarrow\textbf{k}=\textbf{A}^{-1}\textbf{b}\]

On a CAS calculator, this is solved very quickly. The answer will be the matrix k and you just pick off the elements for each kᵢ in the same order as in the setup of the matrix equation. Doing this on my CAS, I get the answer:

\[\textbf{k}=\begin{bmatrix}0.8\\0.05\\0.01\end{bmatrix}\]
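If you do not have a CAS calculator handy, the same calculation can be sketched in Python. This is only an illustration, assuming the NumPy library is installed; the variable names are my own choice and simply mirror the A, k and b of the matrix equation.

```python
import numpy as np

# Coefficient matrix A: each row holds the rotation rates (radians/second)
# about axes 1, 2 and 3 for one measurement
A = np.array([[3.0, 2.0, 1.0],
              [5.0, -5.0, 7.0],
              [6.0, -6.0, -7.0]])

# Right side b: the measured output voltages (volts)
b = np.array([2.51, 3.82, 4.43])

# Solve A k = b; mathematically the same as k = A^{-1} b,
# but without forming the inverse explicitly
k = np.linalg.solve(A, b)
print(k)  # the three conversion factors k1, k2, k3
```

Using `np.linalg.solve` rather than computing the inverse first is the usual choice, as it is generally faster and less sensitive to rounding.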

So the output model for this gyro is

\[0.8\omega_1+0.05\omega_2+0.01\omega_3=v\]

Given the corrected outputs of the other two gyros, the navigation system knows what the true rotation about axis 1 is.
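As mentioned earlier, the real calibration used many more than three measurements and a least squares fit. A minimal sketch of that idea follows. Please note that the data here is made up: the 20 rotation rates, the noise level and the random seed are all assumptions chosen for illustration, not the actual test data, and NumPy's `lstsq` is just one way to do such a fit.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# The k values from the simplified model above, used to fabricate test data
k_true = np.array([0.8, 0.05, 0.01])

# ASSUMPTION: 20 made-up measurements with random rotation rates (rad/s)
rates = rng.uniform(-10.0, 10.0, size=(20, 3))

# ASSUMPTION: simulated voltages with a small random measurement error
volts = rates @ k_true + rng.normal(0.0, 0.01, size=20)

# Least squares: find the k that minimises the sum of squared errors
# between rates @ k and the measured volts
k_est, residuals, rank, _ = np.linalg.lstsq(rates, volts, rcond=None)
print(k_est)  # close to, but not exactly, k_true
```

With more equations than unknowns there is generally no exact solution, so the fit returns the k's that come closest to satisfying all the equations at once.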

Rotation matrix

Another “rocket scientist” application of matrices associated with navigation is the rotation matrix. Suppose your spacecraft is halfway between earth and the moon and a course correction is needed. In which direction do you burn your rocket engines? If you just point your spacecraft toward the moon and fire, you will miss by a lot. This is because you are not taking your spacecraft’s and the moon’s motion into account. And when I say “motion”, I mean with respect to an inertial (that is, static) reference coordinate system, called an inertial reference frame. Depending on the scenario, this could be a system at the centre of the earth or the sun that does not rotate with the earth or sun.

So calculations are made with respect to the inertial frame, but the spacecraft’s navigation system only knows its own reference frame when it comes to firing the engines. Rotation matrices are how an inertial direction is converted to a spacecraft’s reference frame.

To simplify this a bit, I will limit the coordinate systems to be 2-dimensional, but the concept can easily be extended to 3 dimensions.

Consider the two coordinate systems below:

where the x-y system is the inertial frame and the x’-y’ system is the spacecraft reference frame. Now I know that the origins of these two frames will be physically separated, but it turns out that this does not matter. Only the angle (angles for 3 dimensions) between the two frames matters. I draw them together so you can more easily see the result.

Now suppose that, with respect to the inertial frame, which does take into account the motions of the spacecraft and the moon, it is calculated that the direction the engines should fire for the course correction is v as indicated in the diagram. This direction has an x value and a y value which correspond to where the arrow of v is. The spacecraft has to convert this to its coordinate system, and it does so with the following matrix multiplication:

\[\begin{bmatrix}x'\\y'\end{bmatrix}=\begin{bmatrix}\text{cos}(\theta)&\text{sin}(\theta)\\-\text{sin}(\theta)&\text{cos}(\theta)\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}\]

The angle θ is known from the various navigation systems that are being used to track the spacecraft, including an inertial navigation system that has gyroscopes that we just modelled. Now the spacecraft knows in what direction it has to fire its engines.
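The matrix multiplication above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, NumPy is assumed to be installed, and I am assuming θ is measured counterclockwise from the inertial x axis to the spacecraft x’ axis, which is the convention that matches the matrix shown.

```python
import numpy as np

def inertial_to_spacecraft(v_inertial, theta):
    """Convert a 2-D direction (x, y) in the inertial frame to (x', y')
    in the spacecraft frame. theta is in radians (assumed measured
    counterclockwise from the x axis to the x' axis)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, s],
                  [-s, c]])  # same layout as the rotation matrix above
    return R @ np.asarray(v_inertial, dtype=float)

# Example: the spacecraft frame is turned 90 degrees from the inertial frame,
# and the required burn direction lies along the inertial x axis
v_body = inertial_to_spacecraft([1.0, 0.0], np.pi / 2)
print(v_body)  # approximately [0, -1]: along the spacecraft's negative y' axis
```

Notice that the length of the direction vector does not change; a rotation matrix only changes how the same arrow is described.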

Other uses

In the previous example, the navigation computers needed to have a model of all the forces acting on the spacecraft and the resulting motions, that is its dynamics, in order to calculate the vector v. This model is a matrix differential equation that is constantly being numerically solved. To do this, the computer needs to keep track of the spacecraft’s position in 3-dimensional space (3 variables) as well as the velocity in each of those dimensions (3 more variables). The matrix needed for that model has to have the same number of rows and columns as the number of variables, so that is a 6 × 6 matrix (36 elements).
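To give a feel for what such a 6 × 6 matrix looks like, here is the simplest possible case, which is my own illustration and not the model an actual navigation computer would use: a spacecraft with no forces acting on it at all. The symbols x, y, z for position and v_x, v_y, v_z for velocity are assumed names. The only thing the matrix has to say is that the rate of change of each position is the corresponding velocity, and that the velocities do not change:

\[\frac{d}{dt}\begin{bmatrix}x\\y\\z\\v_x\\v_y\\v_z\end{bmatrix}=\begin{bmatrix}0&0&0&1&0&0\\0&0&0&0&1&0\\0&0&0&0&0&1\\0&0&0&0&0&0\\0&0&0&0&0&0\\0&0&0&0&0&0\end{bmatrix}\begin{bmatrix}x\\y\\z\\v_x\\v_y\\v_z\end{bmatrix}\]

A realistic model would add further terms for gravity, thrust and the other forces involved.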

Matrices are used in electronics, optics, quantum mechanics, cryptography, robotics – you get the point.

Engineering Topics – Complex Numbers

The world moves and so do many things that occur naturally or are made by humans. Because of this, when engineers or scientists want to mathematically model a system or process, they frequently need to identify not only the variables of interest (like position), but the rate of change of these variables (like velocity). These mathematical models are equations that relate the variables and their rates of change. Equations like this are called differential equations and how to solve these equations is usually introduced to students after they have studied calculus. And solutions to these equations frequently have complex numbers in them.

What is a complex number? A complex number is defined using the definition of the imaginary unit, i:

\[i=\sqrt{-1}\]

This may look like a crime against mathematics as through much of our maths education, we were told that you cannot take the square root of a negative number. This is reinforced by trying to take the square root of a negative number on many calculators resulting in an error. But it turned out that this invention had some usefulness in maths.

Numbers like bi where b is a real number are called imaginary numbers. If you add a real number a to this, a + bi, you get a complex number, where a is the real part and b is the imaginary part. There is a lot of theory surrounding complex numbers, but I will only cover what is necessary for this post.

Complex numbers satisfy many of the properties you are familiar with using real numbers, including the rules involving exponents. So using a complex number as an exponent to the natural base e (a number, like π, which is frequently used in engineering), the expression can be split into two parts:

\[e^{a+bi}=e^ae^{bi}\]

You are familiar with e^a, but what do we do with e^(bi)? This is actually a complex number as well and can be put into a standard form using Euler’s formula:

\[e^{bi}=\text{cos}\,b+i\,\text{sin}\,b\]

In textbooks, the right side of the above equation is abbreviated as cis b. So

\[e^{a+bi}=e^a(\text{cos}\,b+i\,\text{sin}\,b)=e^a\,\text{cis}\,b\]
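If you would like to convince yourself that this works, it can be checked numerically. The short Python sketch below uses only the standard library; the values of a and b are arbitrary numbers I picked for the example.

```python
import cmath
import math

a, b = 0.5, 2.0  # arbitrary example values (assumption for illustration)

# Left side: e raised to the complex number a + bi
lhs = cmath.exp(complex(a, b))

# Right side: e^a (cos b + i sin b), built from real-number functions only
rhs = math.exp(a) * complex(math.cos(b), math.sin(b))

print(lhs)
print(rhs)  # the two agree to within floating point rounding
```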

So what can imaginary numbers tell us about the real world? Well, the “useless” things you were taught about quadratic equations are about to become useful.

The Damped Harmonic Oscillator

There are many things that oscillate, but not forever: electronic circuits, your car when it hits a bump, aircraft when they hit an air pocket. A simple example is a mass on a spring with a damper attached:

Modified image from https://commons.wikimedia.org/

There’s a lot of physics happening here. But all you need to know is that there are 3 main forces affecting the motion of the mass. These forces are created by: gravity (which creates an acceleration downward), spring (proportional to the position of the mass), and damper (proportional to the velocity of the mass). What is the position of the mass at any time t?

Using Newton’s second law, F = ma, the following differential equation can be generated. A differential equation is an equation that relates a variable (in this case, the position x) with its rates of change (in this case velocity and acceleration):

\[m\ddot{x}+b\dot{x}+kx=0\]

where x is the position of the mass relative to a reference point, b is the damping coefficient (how strong is the damper), k is the spring constant (how strong is the spring), x with one dot above it is the rate of change of x with respect to time (commonly known as velocity), and x with two dots is the rate of change of velocity with respect to time (commonly known as acceleration).

Solving differential equations is a whole university course, but for this type of equation, the solution will be of the form:

\[x(t)=Ae^{\lambda t}\]

where A is the initial position of the mass at t = 0. So the problem reduces to finding λ. It turns out that λ is the solution to the corresponding algebraic equation (called the characteristic equation):

\[m\lambda^2+b\lambda+k=0\]

So yes, here is an example where you use the quadratic skills you learned. Using the quadratic formula:

\[\lambda=\frac{-b\pm\sqrt{b^2-4mk}}{2m}\]

The discriminant, b² − 4mk, dictates the type of solutions for λ. In this post, I am interested in the case where b² − 4mk < 0, the underdamped case, which is shown graphically in the animation above.

With a little bit of algebra and using the definition of i to factor out the −1 inside the square root (so that what remains in the square root is positive):

\[\lambda=\frac{-b\pm\sqrt{(-1)(4mk-b^2)}}{2m}=\frac{-b}{2m}\pm i\sqrt{\frac{k}{m}-\frac{b^2}{4m^2}}\]

If we let

\[\alpha=\frac{b}{2m}\text{  and  }\omega=\sqrt{\frac{k}{m}-\frac{b^2}{4m^2}}\]

then λ = −α ± ωi. So the solution is
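As a quick numerical check of this algebra, the sketch below solves the characteristic equation with the quadratic formula and compares the roots with −α ± ωi. The values of m, b and k are made up for the example and are chosen so that the discriminant is negative.

```python
import cmath
import math

# ASSUMPTION: made-up mass, damping coefficient and spring constant
m, b, k = 1.0, 0.4, 4.0

disc = b**2 - 4 * m * k        # discriminant; negative means underdamped
sqrt_disc = cmath.sqrt(disc)   # cmath handles the square root of a negative

# The two roots from the quadratic formula
lam_plus = (-b + sqrt_disc) / (2 * m)
lam_minus = (-b - sqrt_disc) / (2 * m)

# The same roots written as -alpha +/- omega*i
alpha = b / (2 * m)
omega = math.sqrt(k / m - b**2 / (4 * m**2))

print(lam_plus, lam_minus)
print(complex(-alpha, omega), complex(-alpha, -omega))
```

Both lines print the same pair of complex numbers, with real part −α and imaginary parts ±ω.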

\[x(t)=Ae^{(-\alpha\pm \omega i)t}=Ae^{-\alpha t}e^{(\pm\omega ti)}=Ae^{-\alpha t}(\text{cos}(\pm \omega t)+i\,\text{sin}(\pm\omega t))\]

Now it looks like we still have an imaginary part in the answer. We need a real solution that fits the real world. As I said before, solving differential equations is a separate subject usually studied at uni. In that subject, you would learn about the superposition principle, where any linear combination of two separate solutions of a differential equation will also be a solution. Notice that we do, in fact, have two solutions above: one using the + and the other using the −. Using relationships that exist for circular (trig) functions for sine and cosine, we can add these two solutions together and the imaginary part will cancel out, leaving only a real solution:

\[x(t)=Ae^{-\alpha t}\text{cos}(\omega t)\]
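For those who want to see the cancellation, it can be written out in one line. Here I take half of each of the two solutions (the factor of one half is my choice, made so that the result has amplitude A) and use the relationships cos(−ωt) = cos(ωt) and sin(−ωt) = −sin(ωt):

\[\tfrac{1}{2}Ae^{-\alpha t}\left(\text{cos}(\omega t)+i\,\text{sin}(\omega t)\right)+\tfrac{1}{2}Ae^{-\alpha t}\left(\text{cos}(-\omega t)+i\,\text{sin}(-\omega t)\right)=Ae^{-\alpha t}\text{cos}(\omega t)\]

The two cosine terms add, while the two sine terms are equal and opposite and so cancel.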

As t grows, the cosine part of the solution just bounces up and down between ±A. But the exponent of e gets more negative as t grows, making e^(−αt) smaller, starting at 1 when t = 0. This generates the following curve:

Notice how the exponential part of the solution is an envelope that the cosine curve must fit into.
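The solution and its envelope are easy to tabulate. The Python sketch below does this for a few values of t; the function names and the values of A, m, b and k are assumptions chosen for illustration, and the formula only applies to the underdamped case.

```python
import math

def damped_position(t, A, m, b, k):
    """x(t) = A e^(-alpha t) cos(omega t) for the underdamped case.
    math.sqrt raises ValueError if b^2 - 4mk > 0 (not underdamped)."""
    alpha = b / (2 * m)
    omega = math.sqrt(k / m - b**2 / (4 * m**2))
    return A * math.exp(-alpha * t) * math.cos(omega * t)

def envelope(t, A, m, b):
    """The exponential part A e^(-alpha t) that bounds the oscillation."""
    return A * math.exp(-b * t / (2 * m))

# ASSUMPTION: made-up starting position, mass, damping and spring constant
A, m, b, k = 1.0, 1.0, 0.4, 4.0

for t in [0.0, 1.0, 2.0, 5.0, 10.0]:
    x = damped_position(t, A, m, b, k)
    print(f"t={t:5.1f}  x={x:+.4f}  envelope=±{envelope(t, A, m, b):.4f}")
```

At every time in the table, the position stays between plus and minus the envelope value.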

Instead of a mass hanging on a spring, the differential equation we started out with could represent your car’s suspension system which has springs and shock absorbers (dampers). Notice that we can change the parameters, the strengths of the springs and shock absorbers, to change the way a car handles bumps. The response in the graph above may be too loose and we may want to change the parameters to make the car settle down more quickly. This is engineering.
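To put a number on “settle down more quickly”, one rough measure is how long the envelope takes to shrink to a small fraction of its starting value. The sketch below compares two damper strengths. The function name, the 5% threshold and the parameter values are all my own assumptions for illustration, not values from a real suspension.

```python
import math

def settle_time(m, b, fraction=0.05):
    """Time for the envelope e^(-alpha t) to shrink to the given
    fraction of its starting value, where alpha = b / (2m)."""
    alpha = b / (2 * m)
    return math.log(1.0 / fraction) / alpha

# ASSUMPTION: made-up mass and two made-up damping coefficients
# (with k = 4, as in the earlier sketches, both are still underdamped)
m = 1.0
for b in (0.4, 1.2):
    print(f"b={b}: envelope falls to 5% after {settle_time(m, b):.1f} seconds")
```

Tripling the damping coefficient cuts this time to a third, since the time is inversely proportional to α.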