What does it mean to transform as a scalar or vector?

Asked by nonagon on February 8, 2021

I’m working through an introductory electrodynamics text (Griffiths), and I encountered a pair of questions asking me to show that:

  1. the divergence transforms as a scalar under rotations
  2. the gradient transforms as a vector under rotations

I can see how to show these things mathematically, but I’d like to gain some intuition about what it means to “transform as a” vector or scalar. I have found definitions, but none using notation consistent with the Griffiths book, so I was hoping for some confirmation.

My guess is that “transforms as a scalar” applies to a scalar field, e.g. $T(y,z)$ (working in two dimensions since the questions in the book are limited to two dimensions).
It says that if you relabel all of the coordinates in the coordinate system using:
$$\begin{pmatrix}\bar{y} \\ \bar{z}\end{pmatrix} = \begin{pmatrix}\cos\phi & \sin\phi \\ -\sin\phi & \cos\phi\end{pmatrix} \begin{pmatrix}y \\ z\end{pmatrix}$$
so $(\bar{y},\bar{z})$ gives the relabeled coordinates for point $(y,z)$, then: $$\bar{T}(\bar{y},\bar{z}) = T(y,z)$$
for all $y, z$ in the coordinate system, where $\bar{T}$ is the rotated scalar field. Then I thought perhaps I'm trying to show something like this?
$$\overline{(\nabla \cdot T)}(\bar{y},\bar{z})=(\nabla \cdot T)(y,z)$$
where $\overline{(\nabla \cdot T)}$ is the rotated divergence.

The notation above looks strange to me, so I’m wondering if it’s correct. I’m also quite curious what the analogous formalization of “transforms as a vector field” would look like.
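
Here is a quick numerical check of the relabeling rule $\bar{T}(\bar{y},\bar{z}) = T(y,z)$ above, with an arbitrary sample field and angle (the equality holds by construction once $\bar{T}$ is defined by evaluating $T$ at the original coordinates of each point):

```python
import numpy as np

def T(y, z):
    # arbitrary sample scalar field
    return y**2 + 3*y*z

phi = 0.7  # arbitrary rotation angle
R = np.array([[np.cos(phi),  np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])

def T_bar(y_bar, z_bar):
    # The rotated field: evaluate T at the original coordinates of the point
    # labelled (y_bar, z_bar) in the new system, i.e. at R^{-1}(y_bar, z_bar).
    y, z = R.T @ np.array([y_bar, z_bar])  # R^{-1} = R^T for a rotation
    return T(y, z)

y, z = 1.3, -0.4                      # an arbitrary point
y_bar, z_bar = R @ np.array([y, z])   # its relabeled coordinates
print(np.isclose(T_bar(y_bar, z_bar), T(y, z)))  # True
```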

2 Answers

There are a number of ways of mathematically formalizing the notions "transforming as a vector" or "transforming as a scalar" depending on the context, but in the context you're considering, I'd recommend the following:

Consider a finite number of types of objects $o_1, \dots, o_n$, each of which lives in some set $O_i$ of objects, and each of which is defined to transform in a particular way under rotations. In other words, given any rotation $R$, for each object $o_i$ we have a mapping which, acting on objects in $O_i$, tells us what happens to them under the rotation $R$: \begin{align} o_i \mapsto o_i^R = \text{something we specify.} \end{align} For example, if $o_1$ is just a vector $\mathbf r$ in three-dimensional Euclidean space $\mathbb R^3$, then one would typically take \begin{align} \mathbf r \mapsto \mathbf r^R = R\mathbf r. \end{align} Each mapping $o_i\mapsto o_i^R$ is what a mathematician would call a group action of the group of rotations on the set $O_i$ (there are more details in defining a group action which we ignore here). Once we have specified how these different objects $o_i$ transform under rotations, we can make the following definition:
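
As a minimal sketch of this group-action idea (in Python, with arbitrary rotations and an arbitrary vector), acting with $R_2$ and then $R_1$ on $\mathbf r$ gives the same result as acting once with the composed rotation $R_1R_2$:

```python
import numpy as np

def rot_z(phi):
    # rotation of R^3 about the z-axis by angle phi
    return np.array([[np.cos(phi), -np.sin(phi), 0.0],
                     [np.sin(phi),  np.cos(phi), 0.0],
                     [0.0,          0.0,         1.0]])

def rot_x(theta):
    # rotation of R^3 about the x-axis by angle theta
    return np.array([[1.0, 0.0,            0.0],
                     [0.0, np.cos(theta), -np.sin(theta)],
                     [0.0, np.sin(theta),  np.cos(theta)]])

R1, R2 = rot_z(0.4), rot_x(-0.9)   # two arbitrary rotations
r = np.array([1.0, 2.0, -0.5])     # an arbitrary vector in R^3

# r -> R r, applied twice, agrees with the single composed rotation R1 R2
print(np.allclose(R1 @ (R2 @ r), (R1 @ R2) @ r))  # True
```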

Definition. Scalar under rotations

Let any function $f:O_1\times O_2\times\cdots\times O_n\to \mathbb R$ be given; we say it is a scalar under rotations provided \begin{align} f(o_1^R, \dots, o_n^R) = f(o_1, \dots, o_n). \end{align} This definition is intuitively just saying that if you "build" an object $f$ out of a bunch of other objects $o_i$ whose transformation under rotations you have already specified, then the new object $f$ which you have constructed is considered a scalar if it doesn't change when you apply a rotation to all of the objects it's built out of.

Example. The dot product

Let $n=2$, and let $o_1 = \mathbf r_1$ and $o_2 = \mathbf r_2$ both be vectors in $\mathbb R^3$. We define $f$ as follows: \begin{align} f(\mathbf r_1, \mathbf r_2) = \mathbf r_1\cdot \mathbf r_2. \end{align} Is $f$ a scalar under rotations? Well, let's see: \begin{align} f(\mathbf r_1^R, \mathbf r_2^R) = (R\mathbf r_1)\cdot (R\mathbf r_2) = \mathbf r_1\cdot (R^TR\,\mathbf r_2) = \mathbf r_1\cdot \mathbf r_2 = f(\mathbf r_1, \mathbf r_2), \end{align} so yes it is!
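
Here is a quick numerical version of the same check (arbitrary vectors and rotation angle, using numpy):

```python
import numpy as np

phi = 1.1
R = np.array([[np.cos(phi), -np.sin(phi), 0.0],
              [np.sin(phi),  np.cos(phi), 0.0],
              [0.0,          0.0,         1.0]])  # an arbitrary rotation of R^3

r1 = np.array([1.0, -2.0, 0.5])
r2 = np.array([0.3,  1.7, -1.2])

# f(r1^R, r2^R) == f(r1, r2): the dot product is unchanged by the rotation
print(np.isclose(np.dot(R @ r1, R @ r2), np.dot(r1, r2)))  # True
```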

Now what about a field of scalars? How do we define such a beast? Well we just have to slightly modify the above definition.

Definition. Field of scalars

Let any function $f:O_1\times\cdots\times O_n\times\mathbb R^3\to \mathbb R$ be given. We call $f$ a field of scalars under rotations provided \begin{align} f(o_1^R, \dots, o_n^R)(R\mathbf x) = f(o_1, \dots, o_n)(\mathbf x). \end{align} You can think of this as simply saying that the rotated version of $f$, evaluated at the rotated point $R\mathbf x$, agrees with the unrotated version of $f$ evaluated at the unrotated point. Notice that this is formally the same as the equation you wrote down, namely $\bar T(\bar y, \bar z) = T(y,z)$.

Example. Divergence of a vector field

Consider the case that $\mathbf v$ is a vector field. Rotations are conventionally defined to act on vector fields as follows (I'll try to find another post on physics.SE that explains why): \begin{align} \mathbf v^R(\mathbf x) = R\,\mathbf v(R^{-1}\mathbf x). \end{align} Is its divergence a scalar field? Well, to make contact with the definition we give above, let $f$ denote the divergence, namely \begin{align} f(\mathbf v)(\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x). \end{align} Now notice that using the chain rule we get (we use Einstein summation notation) \begin{align} (\nabla\cdot\mathbf v^R)(\mathbf x) &= \nabla\cdot\big(R\,\mathbf v(R^{-1}\mathbf x)\big) \\ &= \partial_i\big(R_{ij}\,v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij}\,\partial_i\big(v_j(R^{-1}\mathbf x)\big) \\ &= R_{ij}(R^{-1})_{ki}\,(\partial_k v_j)(R^{-1}\mathbf x) \\ &= (\nabla\cdot \mathbf v)(R^{-1}\mathbf x), \end{align} which implies that \begin{align} (\nabla\cdot\mathbf v^R)(R\mathbf x) = (\nabla\cdot \mathbf v)(\mathbf x). \end{align} But the left-hand side is precisely $f(\mathbf v^R)(R\mathbf x)$ and the right-hand side is $f(\mathbf v)(\mathbf x)$, so we have \begin{align} f(\mathbf v^R)(R\mathbf x) = f(\mathbf v)(\mathbf x). \end{align} This is precisely the condition that $f$ (the divergence of a vector field) be a scalar field under rotations.
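
The same chain-rule argument can be checked symbolically, here in two dimensions with an arbitrary polynomial sample vector field (a sympy sketch of the check, not part of the proof above):

```python
import sympy as sp

y, z, yb, zb, phi = sp.symbols('y z ybar zbar phi', real=True)

# Plane rotation and an arbitrary sample vector field v(y, z)
R = sp.Matrix([[sp.cos(phi),  sp.sin(phi)],
               [-sp.sin(phi), sp.cos(phi)]])
v = sp.Matrix([y**2 * z, 3*y + z**2])

# Rotated field: v^R(xbar) = R v(R^{-1} xbar), with R^{-1} = R^T
old = R.T * sp.Matrix([yb, zb])
v_R = R * v.subs([(y, old[0]), (z, old[1])], simultaneous=True)

# Divergence of the rotated field, evaluated at the rotated point R x ...
div_vR = sp.diff(v_R[0], yb) + sp.diff(v_R[1], zb)
Rx = R * sp.Matrix([y, z])
lhs = div_vR.subs([(yb, Rx[0]), (zb, Rx[1])], simultaneous=True)

# ... equals the divergence of the original field at x
rhs = sp.diff(v[0], y) + sp.diff(v[1], z)
print(sp.simplify(lhs - rhs))  # 0
```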

Extension to vectors and vector fields.

To define a vector under rotations, and a field of vectors under rotations, we do a very similar procedure, but instead we have functions $\mathbf f:O_1\times O_2\times\cdots\times O_n\to \mathbb R^3$ and $\mathbf f:O_1\times O_2\times\cdots\times O_n\times\mathbb R^3\to \mathbb R^3$ respectively (in other words, the right-hand side of the arrow gets changed from $\mathbb R$ to $\mathbb R^3$), and the defining equations for a vector and a field of vectors become \begin{align} \mathbf f(o_1^R, \dots, o_n^R) = R\,\mathbf f(o_1, \dots, o_n) \end{align} and \begin{align} \mathbf f(o_1^R, \dots, o_n^R)(R\mathbf x) = R\,\mathbf f(o_1, \dots, o_n)(\mathbf x) \end{align} respectively. In other words, there is an extra $R$ multiplying the right-hand side.
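
For example, the gradient from the original question fits this template. Taking the usual convention that a scalar field transforms as $T^R(\mathbf x) = T(R^{-1}\mathbf x)$, the claim "the gradient transforms as a (field of) vectors" is $\nabla(T^R)(R\mathbf x) = R\,(\nabla T)(\mathbf x)$, which can again be checked symbolically in two dimensions with an arbitrary sample $T$ (a sympy sketch, not a general proof):

```python
import sympy as sp

y, z, yb, zb, phi = sp.symbols('y z ybar zbar phi', real=True)

R = sp.Matrix([[sp.cos(phi),  sp.sin(phi)],
               [-sp.sin(phi), sp.cos(phi)]])
T = y**3 * z + 2*y*z**2              # arbitrary sample scalar field

# Rotated scalar field: T^R(xbar) = T(R^{-1} xbar)
old = R.T * sp.Matrix([yb, zb])
T_R = T.subs([(y, old[0]), (z, old[1])], simultaneous=True)

# Gradient of T^R, evaluated at the rotated point R x ...
grad_TR = sp.Matrix([sp.diff(T_R, yb), sp.diff(T_R, zb)])
Rx = R * sp.Matrix([y, z])
lhs = grad_TR.subs([(yb, Rx[0]), (zb, Rx[1])], simultaneous=True)

# ... equals R times the gradient of T at x: the extra R of a vector field
rhs = R * sp.Matrix([sp.diff(T, y), sp.diff(T, z)])
print(sp.simplify(lhs[0] - rhs[0]), sp.simplify(lhs[1] - rhs[1]))  # 0 0
```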

Answered by joshphysics on February 8, 2021

I think you have the right idea, but I'll try to write it in a more elucidating notation.

The first thing to make clear is that for this discussion we are only ever working at a single point. We only care about transforming the coordinates that describe the domain to the extent that this induces changes in the associated directions at a point. That is, each point in space can have vectors defined on it, and a very convenient basis for the vector space at that point is the set of directional derivatives, e.g. $$ \mathcal{B} = \{\vec{\partial}_{(x)}, \vec{\partial}_{(y)}, \ldots\}. $$ $\vec{\partial}_{(x)}$ points in the $x$-direction; call it $\vec{e}_x$ if you want. Changing $\{x, y, \ldots\}$ to $\{\bar{x}, \bar{y}, \ldots\}$ will give us a new natural basis $$ \bar{\mathcal{B}} = \{\vec{\partial}_{(\bar{x})}, \vec{\partial}_{(\bar{y})}, \ldots\} $$ at each point.

The point of that discussion is that transformations are local. What numbers you use to identify the point in space are irrelevant, so don't get caught up on whether we're calling the point $(x,y)$ or $(bar{x},bar{y})$. Okay, enough allusions to differential geometry.

Let's look at scalars. A scalar is just a single number from your favorite mathematical field.1 What's more, it doesn't transform when the direction vectors change, since it carries no direction information anyway. If I have a scalar $f$, I could say its representation in either basis is the same: $$ f stackrel{mathcal{B},mathcal{bar{B}}}{to} f. $$

Now consider a vector $\vec{A}$. Since a vector can always be written uniquely as a linear combination of basis vectors, let's do that: $$ \vec{A} = A^x \vec{\partial}_{(x)} + A^y \vec{\partial}_{(y)} + \cdots. $$ But there is another basis floating around, and so I have another decomposition available: $$ \vec{A} = A^{\bar{x}} \vec{\partial}_{(\bar{x})} + A^{\bar{y}} \vec{\partial}_{(\bar{y})} + \cdots. $$ For simplicity, I can just write the coefficients when the basis is understood: \begin{align} \vec{A} & \stackrel{\mathcal{B}}{\to} (A^x, A^y, \ldots) \\ \vec{A} & \stackrel{\bar{\mathcal{B}}}{\to} (A^{\bar{x}}, A^{\bar{y}}, \ldots). \end{align}

The numbers $A^x$, $A^y$, $A^{\bar{x}}$, etc. are just scalars in the mathematical sense, but often we avoid calling them scalars. Instead, we call them components of a vector, and we expect them to collectively transform as a vector when we change basis. That is, if I switch from $\mathcal{B}$ to $\bar{\mathcal{B}}$, I should rewrite $(A^x, A^y, \ldots)$ as $(A^{\bar{x}}, A^{\bar{y}}, \ldots)$ so that the collection of numbers still refers to the same abstract vector.

The actual transformation is simple enough to find. I can always express an element from one basis in terms of the other basis. Suppose for $j \in \{x, y, \ldots\}$ and $\bar{\imath} \in \{\bar{x}, \bar{y}, \ldots\}$ we have coefficients ${\Lambda^{\bar{\imath}}}_j$ such that $$ \vec{\partial}_{(j)} = \sum_{\bar{\imath}} {\Lambda^{\bar{\imath}}}_j \vec{\partial}_{(\bar{\imath})}. $$ Then \begin{align} \sum_{\bar{\imath}} A^{\bar{\imath}} \vec{\partial}_{(\bar{\imath})} &= \vec{A} \\ &= \sum_j A^j \vec{\partial}_{(j)} \\ &= \sum_j A^j \sum_{\bar{\imath}} {\Lambda^{\bar{\imath}}}_j \vec{\partial}_{(\bar{\imath})} \\ &= \sum_{\bar{\imath}} \sum_j {\Lambda^{\bar{\imath}}}_j A^j \vec{\partial}_{(\bar{\imath})}. \end{align} Because basis decompositions are unique, we can read off $$ A^{\bar{\imath}} = \sum_j {\Lambda^{\bar{\imath}}}_j A^j. $$ In matrix notation, this is $$ \begin{pmatrix} A^{\bar{x}} \\ A^{\bar{y}} \\ \vdots \end{pmatrix} = \begin{pmatrix} {\Lambda^{\bar{x}}}_x & {\Lambda^{\bar{x}}}_y & \cdots \\ {\Lambda^{\bar{y}}}_x & {\Lambda^{\bar{y}}}_y & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} A^x \\ A^y \\ \vdots \end{pmatrix}. $$
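
For the plane rotation in the original question this can be made concrete with a small numerical sketch (arbitrary component values; for that rotation the matrix $\Lambda$ works out to be just the rotation matrix, and the new basis vectors, written in the old basis, are its rows):

```python
import numpy as np

phi = 0.6
# For the plane rotation from the question, Lambda is the rotation matrix itself
Lam = np.array([[np.cos(phi),  np.sin(phi)],
                [-np.sin(phi), np.cos(phi)]])

A_old = np.array([2.0, -1.0])   # components (A^x, A^y) in basis B (arbitrary values)
A_new = Lam @ A_old             # components (A^xbar, A^ybar) in basis Bbar

# New basis vectors, expressed in the old basis, are the rows of Lambda
e_xbar, e_ybar = Lam[0], Lam[1]

# Both decompositions give back the same abstract vector
v_from_old = A_old[0] * np.array([1.0, 0.0]) + A_old[1] * np.array([0.0, 1.0])
v_from_new = A_new[0] * e_xbar + A_new[1] * e_ybar
print(np.allclose(v_from_old, v_from_new))  # True
```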

When a physicist checks that $vec{A}$ transforms as a vector, what is usually meant is that we have one set of formulas for $A^x, A^y, ldots$ in $mathcal{B}$, and another set for calculating $A^bar{x}, A^bar{y}, ldots$ in $bar{mathcal{B}}$, and we want to make sure that the components are describing the same abstract vector $vec{A}$. This is the case if the sets of components transform according to the rule given above.

In your case, you may be handed the scalar $T$ (i.e. $f$ above). You can calculate the values $\partial T/\partial x$, $\partial T/\partial y$, etc. (Here is where the dependence on other points comes in, since you are often given $T$ as a function of the coordinates so that you can calculate its partial derivatives.) You can assemble the (column) vector $(\partial T/\partial x, \partial T/\partial y, \ldots)$. You could do the same in another basis, with other partial derivatives, assembling $(\partial T/\partial\bar{x}, \partial T/\partial\bar{y}, \ldots)$. It is not a priori clear, however, that these two sets of components will obey the above transformation law. Fortunately, though, the gradient $\nabla T$ (i.e. $\vec{A}$) defined this way is a true vector and it transforms correctly.
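
Spelled out for the plane rotation of the question (a sympy sketch with an arbitrary sample $T$; here $\Lambda$ is again the rotation matrix): the column $(\partial T/\partial\bar{x}, \partial T/\partial\bar{y})$, computed by first rewriting $T$ in the barred coordinates, is related to $(\partial T/\partial x, \partial T/\partial y)$ by exactly the transformation law above.

```python
import sympy as sp

x, y, xb, yb, phi = sp.symbols('x y xbar ybar phi', real=True)

# Barred coordinates via a plane rotation; Lambda is the rotation matrix here
Lam = sp.Matrix([[sp.cos(phi),  sp.sin(phi)],
                 [-sp.sin(phi), sp.cos(phi)]])
T = x**3 * y + 2*y**2                      # arbitrary sample scalar T(x, y)

# Rewrite T as a function of (xbar, ybar):  (x, y) = Lam^T (xbar, ybar)
old = Lam.T * sp.Matrix([xb, yb])
T_barred = T.subs([(x, old[0]), (y, old[1])], simultaneous=True)

# Components of the gradient assembled in each basis
grad_old = sp.Matrix([sp.diff(T, x), sp.diff(T, y)])
grad_new = sp.Matrix([sp.diff(T_barred, xb), sp.diff(T_barred, yb)])

# Evaluate the barred components at the barred labels of the point (x, y)
bar_coords = Lam * sp.Matrix([x, y])
lhs = grad_new.subs([(xb, bar_coords[0]), (yb, bar_coords[1])], simultaneous=True)

# They obey A^ibar = Lambda^ibar_j A^j, i.e. they describe the same vector
rhs = Lam * grad_old
print(sp.simplify(lhs[0] - rhs[0]), sp.simplify(lhs[1] - rhs[1]))  # 0 0
```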


$^1$ $\mathbb{R}$ or $\mathbb{C}$ or whatever. Note when we say "field" in physics we often mean "function mapping the entire space in question to some sort of mathematical object." So a scalar field assigns a scalar to each point in space, a vector field assigns a vector, etc. But since we're only discussing what happens at a single point, the physicists' notion of "field" is not important here. If you really want to transform an entire scalar field or vector field, just take what's done here and apply it to every point in space.

Answered by user10851 on February 8, 2021
