SLIDE 1 Adapted Bases and Fast Transforms
Michael Orrison
Harvey Mudd College
Joint Work with Michael Hansen and Masanori Koyama
SLIDE 2 Functions Let X = {x1, . . . , xn} be a finite set, and let CX be the complex vector space of complex-valued functions defined on X: CX = {f : X → C}. We identify x ∈ X with the function that is 1 on x and 0 on all other elements of X.
Using the xi as a basis, we can then encode f ∈ CX as a column vector: $f \mapsto (f(x_1), f(x_2), \dots, f(x_n))^{T}$. These column vectors represent the data we would like to analyze.
SLIDE 3 Actions If G is a finite group acting on X, then G acts on CX where if g ∈ G and f ∈ CX, then (g · f)(x) = f(g⁻¹x). We can then extend this action to the group algebra CG, which makes CX a CG-permutation module. In the background is a permutation representation $\phi : G \to GL_{|X|}(\mathbb{C})$ where ϕ(g) is a permutation matrix for all g ∈ G that encodes the action of G on X: $[\phi(g)]_{ij} = 1$ if $g \cdot x_j = x_i$, and $[\phi(g)]_{ij} = 0$ otherwise.
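The relationship between the action on functions and the permutation matrices can be checked on a small case. The following is a minimal sketch, assuming X = {0, 1, 2} and G = S3 with each group element stored as a tuple of images; the helper names (`perm_matrix`, `act`, and so on) are hypothetical and not from the talk.

```python
from itertools import permutations

# Assumed encoding: X = {0, 1, 2}; a group element g is a tuple with g[j] = image of j.
X = [0, 1, 2]
G = list(permutations(X))

def perm_matrix(g):
    """phi(g)[i][j] = 1 if g sends x_j to x_i, and 0 otherwise."""
    n = len(g)
    return [[1 if g[j] == i else 0 for j in range(n)] for i in range(n)]

def compose(g, h):
    """Product gh, acting as x -> g(h(x))."""
    return tuple(g[h[x]] for x in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for x, y in enumerate(g):
        inv[y] = x
    return tuple(inv)

def act(g, f):
    """(g . f)(x) = f(g^{-1} x), with f stored as a list of values on X."""
    g_inv = inverse(g)
    return [f[g_inv[x]] for x in range(len(f))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]
```

With these conventions, multiplying the column vector of f by `perm_matrix(g)` gives the same result as `act(g, f)`, and `perm_matrix` turns products in G into matrix products.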
SLIDE 4 The Big Idea Let G be a finite group acting on X, and let CX be the resulting permutation module. If we write CX as a direct sum CX = U1 ⊕ · · · ⊕ Um
of submodules, then every f ∈ CX can be written uniquely as
f = f1 + · · · + fm where fi ∈ Ui. If the Ui are meaningful, then we might be able to better understand f by focusing our attention on the fi. This leads to generalized spectral analysis, which was pioneered by Diaconis.
SLIDE 5
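Example If G = S3 and X is the set {1, 2, 3}, then under the usual action of S3, we have $\mathbb{C}X = \langle (1,1,1)^{T} \rangle \oplus \langle (1,-1,0)^{T},\ (0,1,-1)^{T} \rangle$. For example, $(11,10,12)^{T} = (11,11,11)^{T} + (0,-1,1)^{T}$ and $(20,3,10)^{T} = (11,11,11)^{T} + (9,-8,-1)^{T}$.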
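The two pieces in this example are the constant part (the mean) and the part that sums to zero. A minimal sketch of that split, using exact rational arithmetic; the function name `split_mean` is hypothetical.

```python
from fractions import Fraction

def split_mean(f):
    """Return (f1, f2) with f1 constant, f2 summing to zero, and f = f1 + f2."""
    mean = Fraction(sum(f), len(f))
    f1 = [mean] * len(f)
    f2 = [Fraction(v) - mean for v in f]
    return f1, f2

# The two data vectors from the example share the same mean part.
first_mean, first_rest = split_mean([11, 10, 12])
second_mean, second_rest = split_mean([20, 3, 10])
```

Both vectors have the same first-order summary (11, 11, 11); they differ only in the sum-zero component, which is where the comparison between them lives.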
SLIDE 6
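Example If G = X = Z/4Z = {1, z, z2, z3} acts on itself by left multiplication, then we have $\mathbb{C}X = \langle (1,1,1,1)^{T} \rangle \oplus \langle (1,i,-1,-i)^{T} \rangle \oplus \langle (1,-1,1,-1)^{T} \rangle \oplus \langle (1,-i,-1,i)^{T} \rangle$. Note that the associated permutation representation is such that $z \mapsto \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$.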
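For the cyclic group of order 4, the four summands are spanned by the characters, and projecting onto them is the classical 4-point discrete Fourier transform. The sketch below assumes functions on X are stored as lists indexed by the exponent j of z^j; the names `character`, `project`, `decompose`, and `shift` are hypothetical.

```python
# Powers of i; character k takes the value i^(k*j) on z^j.
I_POW = [1 + 0j, 1j, -1 + 0j, -1j]

def character(k):
    return [I_POW[(k * j) % 4] for j in range(4)]

def project(f, k):
    """Component of f in the line spanned by character k."""
    chi = character(k)
    coeff = sum(v * c.conjugate() for v, c in zip(f, chi)) / 4
    return [coeff * c for c in chi]

def decompose(f):
    """Return the four components f_0, ..., f_3 with f = f_0 + f_1 + f_2 + f_3."""
    return [project(f, k) for k in range(4)]

def shift(f):
    """Action of z on f: (z . f)(z^j) = f(z^(j-1))."""
    return [f[(j - 1) % 4] for j in range(4)]
```

Each character is an eigenvector of `shift`, which is why each line is a submodule: applying z to character k multiplies it by i^(-k).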
SLIDE 7
Example If G = Sn and X is the set of k-element subsets of {1, . . . , n} where k ≤ n/2, then we can write CX = U0 ⊕ U1 ⊕ · · · ⊕ Uk where Ui corresponds to pure i-th order effects, and f ∈ CX is typically viewed as voting data. Given f = f0 + f1 + · · · + fk, we might ask about the extent to which ||fi|| depends on f0, · · · , fi−1. (See Algebraic algorithms for sampling from conditional distributions by Diaconis and Sturmfels in Ann. Statist. Volume 26, Number 1 (1998), 363-397.)
SLIDE 8
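Example If G = S3 and X is the set {1, 2, 3}, then under the usual action of S1, S2, and S3, we have $\mathbb{C}X = \langle (1,0,0)^{T} \rangle \oplus \langle (0,1,0)^{T} \rangle \oplus \langle (0,0,1)^{T} \rangle = \langle (1,1,0)^{T} \rangle \oplus \langle (1,-1,0)^{T} \rangle \oplus \langle (0,0,1)^{T} \rangle = \langle (1,1,1)^{T} \rangle \oplus \langle (1,-1,0)^{T},\ (1,1,-2)^{T} \rangle$ where these decompositions also reflect the different orbits and a certain kind of adaptedness.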
SLIDE 9 Questions Suppose CX = U1 ⊕ · · · ⊕ Um where the Ui are CG-submodules.
1 Given f ∈ CX, how efficiently can we compute f1, . . . , fm?
2 Given a basis Bi for each Ui, how efficiently can we do a
change-of-basis from X to the basis B = B1 ∪ · · · ∪ Bm?
3 How should the Ui and the Bi be chosen above so as to be
meaningful and also computationally helpful?
SLIDE 10
SLIDE 11
Machinery
SLIDE 12
Discrete Fourier Transforms Every complex group algebra CG is isomorphic to a direct sum of matrix algebras: $\mathbb{C}G \cong \mathbb{C}^{d_1 \times d_1} \oplus \cdots \oplus \mathbb{C}^{d_h \times d_h}$. Any associated isomorphism $D = D_1 \oplus \cdots \oplus D_h$ is a (generalized) discrete Fourier transform or DFT. Note that the Di form a complete set of irreducible representations for G, and any complete set of irreducible representations for G can be used in this way to construct a DFT for G.
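SLIDE 13 Example $D : \mathbb{C}S_3 \to \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{2 \times 2}$
[Figure: block-diagonal matrices giving the images under D of elements of CS3, including $b^3_{11} + b^3_{22}$, which is sent to the identity matrix in the 2 × 2 block.]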
SLIDE 14 Subgroup-Adapted DFTs Suppose H ≤ G. The DFT D for G is subgroup-adapted to the chain H ≤ G if for each irreducible representation Di of G and for all h ∈ H,
1 Di(h) is block diagonal, where the blocks correspond to
irreducible representations of H, and
2 equivalent blocks among all of the Di are actually equal.
This can also be extended to longer chains of subgroups of G, which we’ll usually take to have the form {1} = G0 < G1 < · · · < Gn = G.
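SLIDE 15 Example $\mathbb{C}S_2 \cong \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{1 \times 1}$ and $\mathbb{C}S_3 \cong \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{2 \times 2}$, with $D : \mathbb{C}S_3 \to \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{1 \times 1} \oplus \mathbb{C}^{2 \times 2}$ and $D|_{\mathbb{C}S_2} : \mathbb{C}S_2 \to$ [Figure: matrix pattern with entries marked ⋆.]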
SLIDE 16
Adapted Bases A basis B for CX is adapted to the DFT D if it can be partitioned B = B1 ∪ · · · ∪ Bk so that each Bj spans an irreducible submodule of CX, and if this submodule corresponds to Di, then [g]Bj = Di(g) for all g ∈ G.
SLIDE 17
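Example If G = S3 and X is the set {1, 2, 3}, then under the usual action of S1, S2, and S3, we have $\mathbb{C}X = \langle (1,0,0)^{T} \rangle \oplus \langle (0,1,0)^{T} \rangle \oplus \langle (0,0,1)^{T} \rangle = \langle (1,1,0)^{T} \rangle \oplus \langle (1,-1,0)^{T} \rangle \oplus \langle (0,0,1)^{T} \rangle = \langle (1,1,1)^{T} \rangle \oplus \langle (1,-1,0)^{T},\ (1,1,-2)^{T} \rangle$.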
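One way to see the adaptedness in this example is to write group elements as matrices in the basis (1, 1, 1), (1, −1, 0), (1, 1, −2). The sketch below assumes S2 is generated by the swap of the first two coordinates; it only checks the block structure and does not fix a particular DFT. The names `B`, `coords`, and `matrix_in_B` are hypothetical.

```python
from fractions import Fraction

# Assumed basis for the final decomposition; the three vectors are mutually orthogonal.
B = [(1, 1, 1), (1, -1, 0), (1, 1, -2)]

def swap(v, a, b):
    """Apply the transposition of coordinate positions a and b to v."""
    w = list(v)
    w[a], w[b] = w[b], w[a]
    return tuple(w)

def coords(v):
    """Coordinates of v in the basis B (valid because B is orthogonal)."""
    return [Fraction(sum(x * y for x, y in zip(v, b)), sum(y * y for y in b))
            for b in B]

def matrix_in_B(a, b):
    """Matrix, with respect to B, of the transposition swapping positions a and b."""
    cols = [coords(swap(v, a, b)) for v in B]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

in_S2 = matrix_in_B(0, 1)      # transposition lying in S2
not_in_S2 = matrix_in_B(1, 2)  # transposition lying in S3 but not in S2
```

The element of S2 comes out diagonal, so each basis vector spans an S2-submodule, while the transposition outside S2 mixes the last two basis vectors and is only block diagonal with blocks of sizes 1 and 2.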
SLIDE 18 Creating Adapted Bases Let $D : \mathbb{C}G \to \mathbb{C}^{d_1 \times d_1} \oplus \cdots \oplus \mathbb{C}^{d_h \times d_h}$ be a DFT. Let $b^k_{ij}$ be the unique element in CG such that $D(b^k_{ij})$ has zeros everywhere except for a 1 in the (i, j) entry of $D_k$. Then the collection $\{b^k_{ij}\}$ of all such elements forms an adapted basis for CG called the dual matrix coefficient basis. More generally, if G acts transitively on X, and x ∈ X, then $\{g \cdot x\}_{g \in G}$ is clearly a spanning set for CX, but $\{b^k_{ij} \cdot x\}$ is a spanning set for CX that contains an adapted basis as a subset. This is how we will create adapted bases, but note that the choice
SLIDE 19 Frequency Subspaces The elements of the form $b^k_{ii}$ are primitive idempotents for CG, and the subspace $b^k_{ii} \cdot \mathbb{C}X$ is the associated frequency space of CX. Note that $(b^k_{11} + \cdots + b^k_{d_k d_k}) \cdot \mathbb{C}X$ is the isotypic subspace corresponding to the representation $D_k$. Key Idea: If D is adapted to the chain H ≤ G, then the frequency spaces of CX with respect to CH are direct sums of the frequency spaces with respect to CG.
SLIDE 20
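Example If G = X = Z/4Z = {1, z, z2, z3} and H = 2Z/4Z, then $\mathbb{C}X = \langle (1,0,1,0)^{T} \rangle \oplus \langle (0,1,0,1)^{T} \rangle \oplus \langle (1,0,-1,0)^{T} \rangle \oplus \langle (0,1,0,-1)^{T} \rangle = \langle (1,1,1,1)^{T} \rangle \oplus \langle (1,-1,1,-1)^{T} \rangle \oplus \langle (1,i,-1,-i)^{T} \rangle \oplus \langle (1,-i,-1,i)^{T} \rangle$.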
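The key idea can be tested directly in this example. A minimal sketch, assuming H = {1, z²} and functions stored as lists indexed by the exponent of z: the two primitive idempotents of CH are (1 + z²)/2 and (1 − z²)/2, and each character of G should lie entirely in one of the two resulting frequency spaces. The names `e_plus` and `e_minus` are hypothetical.

```python
# Powers of i; character k takes the value i^(k*j) on z^j.
I_POW = [1 + 0j, 1j, -1 + 0j, -1j]

def character(k):
    return [I_POW[(k * j) % 4] for j in range(4)]

def act_z2(f):
    """Action of z^2 on f: (z^2 . f)(z^j) = f(z^(j-2))."""
    return [f[(j - 2) % 4] for j in range(4)]

def e_plus(f):
    """Apply the idempotent (1 + z^2)/2: projection onto the trivial frequency space of H."""
    return [(a + b) / 2 for a, b in zip(f, act_z2(f))]

def e_minus(f):
    """Apply the idempotent (1 - z^2)/2: projection onto the sign frequency space of H."""
    return [(a - b) / 2 for a, b in zip(f, act_z2(f))]
```

Characters 0 and 2 are fixed by `e_plus` and killed by `e_minus`, and characters 1 and 3 behave the opposite way, so each frequency space for H is the direct sum of two frequency spaces for G.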
SLIDE 21
Fast Transforms
SLIDE 22 Change-of-Basis Suppose D is adapted to the chain H ≤ G. Although G acts on X transitively, the restriction to H need not be transitive, so X might be a union X = X1 ∪ · · · ∪ Xt
of orbits of X with respect to H. Suppose you have a basis B′ for
CX that respects this partition and is adapted to the associated irreducible representations of H. Questions: How difficult is it to do a change-of-basis from B′ to an adapted basis B with respect to the action of G? Can we bound the number of nonzero entries in the associated change-of-basis matrix?
SLIDE 23 Bounds Based on Frequency Subspaces If the frequency spaces with respect to H have dimensions α1, . . . , αf, then we will have no more than $\alpha_1^2 + \cdots + \alpha_f^2$ nonzero entries in the associated change-of-basis matrix. This can be used to show that if X = G and the Xi are the right cosets of H, then the above sum becomes $\sum_{i=1}^{h} \left([G : H]\, d_i\right)^2 d_i$ where the sum is over all of the irreducible degrees d1, . . . , dh of H.
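The coset formula follows from counting frequency spaces: when X = G, an irreducible representation of H of degree d contributes d frequency spaces, each of dimension [G : H] d. A small sketch of that bookkeeping, with hypothetical function names and the degrees of H passed in by hand:

```python
def frequency_dims(index, degrees):
    """Dimensions alpha_1, ..., alpha_f of the H-frequency spaces of CG.

    index   -- the index [G : H]
    degrees -- the irreducible degrees d_1, ..., d_h of H
    """
    return [index * d for d in degrees for _ in range(d)]

def nonzero_bound(index, degrees):
    """Sum of squares of the frequency-space dimensions."""
    return sum(a * a for a in frequency_dims(index, degrees))

# S2 <= S3: index 3, degrees of S2 are 1, 1.
bound_s3 = nonzero_bound(3, [1, 1])
# S3 <= S4: index 4, degrees of S3 are 1, 1, 2.
bound_s4 = nonzero_bound(4, [1, 1, 2])
```

For S2 ≤ S3 this gives 18, compared with the 36 entries of a dense 6 × 6 matrix; for S3 ≤ S4 it gives 160, compared with 576.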
SLIDE 24 Suppose H ≤ G, and that d1, . . . , dk are the dimensions of the irreducible representations of H, and let $d^3(H) = \sum_{i=1}^{k} d_i^3$. Suppose we have {1} = G0 < G1 < · · · < Gn = G and set qj = [Gj : Gj−1]. Then the step from Gj−1 to Gj requires no more than the following number of nonzeros: $[G_j : G_{j-1}]^2\, d^3(G_{j-1})\, [G : G_j] = q_j^2\, q_{j+1} \cdots q_n\, d^3(G_{j-1})$.
Thus we need no more than $\sum_{j=1}^{n} q_j^2\, q_{j+1} \cdots q_n\, d^3(G_{j-1})$ nonzeros, which is also the bound given in Theorem 7.6 of Clausen and Baum’s book Fast Fourier Transforms.
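To get a feel for the size of this bound, it can be evaluated for the chain S1 < S2 < · · · < Sn. The sketch below is an illustration under stated assumptions, not the algorithm from the talk: the step into Sm has index m, the irreducible degrees of Sm are computed with the hook length formula, and the chain is indexed by m = 2, . . . , n rather than by j. All function names are hypothetical.

```python
from math import factorial

def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def dimension(shape):
    """Degree of the irreducible representation of S_n indexed by shape (hook length formula)."""
    hooks = 1
    for r, row_len in enumerate(shape):
        for c in range(row_len):
            arm = row_len - c - 1
            leg = sum(1 for below in shape[r + 1:] if below > c)
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks

def d3(m):
    """Sum of the cubes of the irreducible degrees of S_m."""
    return sum(dimension(p) ** 3 for p in partitions(m))

def chain_bound(n):
    """Sum over the steps S_{m-1} < S_m of m^2 * [S_n : S_m] * d3(S_{m-1})."""
    return sum(m * m * (factorial(n) // factorial(m)) * d3(m - 1)
               for m in range(2, n + 1))
```

For example, `chain_bound(3)` is 4·3·1 + 9·1·2 = 30 and `chain_bound(4)` is 48 + 72 + 160 = 280.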
SLIDE 25 Two-Sided Attack If we are dealing with the regular module CG, we can take advantage of the fact that we can act on both the left and the right by subgroups K and H. Theorem (Mackey’s Theorem): As a (CK, CH)-bimodule, $\mathbb{C}G \cong \bigoplus_{g} \mathbb{C}K \otimes_{\mathbb{C}H_g} \mathbb{C}H$ where the direct sum is taken over a complete set of double coset representatives.
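The dimension count behind this decomposition can be checked by brute force: the double cosets KgH partition G, and each has |K||H| / |K ∩ gHg⁻¹| elements. A minimal sketch, assuming permutations are stored as tuples of images and that Sm sits inside Sn as the permutations fixing the last n − m points; the helper names are hypothetical.

```python
from itertools import permutations

def compose(g, h):
    """Product gh, acting as x -> g(h(x))."""
    return tuple(g[h[x]] for x in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for x, y in enumerate(g):
        inv[y] = x
    return tuple(inv)

def symmetric_subgroup(n, m):
    """Copy of S_m inside S_n: permutations fixing the points m, ..., n-1."""
    return [p + tuple(range(m, n)) for p in permutations(range(m))]

def double_cosets(G, K, H):
    """Return a list of (representative, double coset KgH) pairs covering G."""
    seen, result = set(), []
    for g in G:
        if g not in seen:
            coset = {compose(k, compose(g, h)) for k in K for h in H}
            seen |= coset
            result.append((g, coset))
    return result

def predicted_size(g, K, H):
    """|K||H| / |K intersect gHg^{-1}|, the expected size of KgH."""
    conjugate = {compose(g, compose(h, inverse(g))) for h in H}
    return len(K) * len(H) // len(conjugate & set(K))
```

Running `double_cosets` with G = S4, K = S2, and H = S3 gives pieces whose sizes add up to 24, each matching `predicted_size`.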
SLIDE 26
Doubly Adapted Basis for the Symmetric Group Let Bn denote the dual matrix coefficient basis for the symmetric group Sn with respect to the orthogonal form of the irreducible representations of Sn. Theorem: If h ≤ k ≤ n, and g is the shortest element in the double coset ShgSk in Sn, then the nonzero elements of {bgb′ | b ∈ Bh and b′ ∈ Bk} form an orthogonal adapted basis for C(ShgSk) for both the left action of CSh and the right action of CSk.
SLIDE 27 Fast Fourier Transform Using such bases with respect to the chain (S1, S1) ≤ (S2, S1) ≤ (S2, S2) ≤ · · · ≤ (Sn−1, Sn−1) ≤ (Sn, Sn−1) has allowed us to create what we think is a new FFT for the family of symmetric groups.
For n ≤ 18, the number of nonzero entries required in the matrix factorization of the DFT for Sn is less than $n^2\, n!$, which makes our algorithm competitive with Maslen’s FFT, which has complexity $O(n^2\, n!)$. The next step is to better understand the orbit/frequency relationship at play in this setup.
SLIDE 28 Questions
1 What are some groups and sets for which these frequency
spaces lead to efficient decomposition algorithms?
2 How do the choices for the x in the spanning sets $\{b^k_{ij} \cdot x\}$
affect the resulting change-of-basis algorithms?
3 Where might we apply these insights and algorithms?
(Statistics, Algebraic Statistics, Machine Learning, etc.)
SLIDE 29
SLIDE 30