Lp space

In mathematics, the Lp spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue (Dunford & Schwartz 1958, III.3), although according to the Bourbaki group (Bourbaki 1987) they were first introduced by Frigyes Riesz (Riesz 1910).

Lp spaces form an important class of Banach spaces in functional analysis, and of topological vector spaces. Because of their key role in the mathematical analysis of measure and probability spaces, Lebesgue spaces are used also in the theoretical discussion of problems in physics, statistics, economics, finance, engineering, and other disciplines.

Applications

[edit]

Statistics

[edit]

In statistics, measures of central tendency and statistical dispersion, such as the mean, median, and standard deviation, can be defined in terms of metrics, and measures of central tendency can be characterized as solutions to variational problems.

In penalized regression, "L1 penalty" and "L2 penalty" refer to penalizing either the norm of a solution's vector of parameter values (i.e. the sum of its absolute values), or its squared norm (its Euclidean length). Techniques which use an L1 penalty, like LASSO, encourage sparse solutions (where the many parameters are zero).[1] Elastic net regularization uses a penalty term that is a combination of the norm and the squared norm of the parameter vector.

Hausdorff–Young inequality

[edit]

The Fourier transform for the real line (or, for periodic functions, see Fourier series), maps to (or to ) respectively, where and This is a consequence of the Riesz–Thorin interpolation theorem, and is made precise with the Hausdorff–Young inequality.

By contrast, if the Fourier transform does not map into

Hilbert spaces

[edit]

Hilbert spaces are central to many applications, from quantum mechanics to stochastic calculus. The spaces and are both Hilbert spaces. In fact, by choosing a Hilbert basis i.e., a maximal orthonormal subset of or any Hilbert space, one sees that every Hilbert space is isometrically isomorphic to (same as above), i.e., a Hilbert space of type

The p-norm in finite dimensions

[edit]
Illustrations of unit circles (see also superellipse) in based on different -norms (every vector from the origin to the unit circle has a length of one, the length being calculated with length-formula of the corresponding ).

The Euclidean length of a vector in the -dimensional real vector space is given by the Euclidean norm:

The Euclidean distance between two points and is the length of the straight line between the two points. In many situations, the Euclidean distance is appropriate for capturing the actual distances in a given space. In contrast, consider taxi drivers in a grid street plan who should measure distance not in terms of the length of the straight line to their destination, but in terms of the rectilinear distance, which takes into account that streets are either orthogonal or parallel to each other. The class of -norms generalizes these two examples and has an abundance of applications in many parts of mathematics, physics, and computer science.

Definition

[edit]

For a real number the -norm or -norm of is defined by The absolute value bars can be dropped when is a rational number with an even numerator in its reduced form, and is drawn from the set of real numbers, or one of its subsets.

The Euclidean norm from above falls into this class and is the -norm, and the -norm is the norm that corresponds to the rectilinear distance.

The -norm or maximum norm (or uniform norm) is the limit of the -norms for It turns out that this limit is equivalent to the following definition:

See L-infinity.

For all the -norms and maximum norm as defined above indeed satisfy the properties of a "length function" (or norm), which are that:

  • only the zero vector has zero length,
  • the length of the vector is positive homogeneous with respect to multiplication by a scalar (positive homogeneity), and
  • the length of the sum of two vectors is no larger than the sum of lengths of the vectors (triangle inequality).

Abstractly speaking, this means that together with the -norm is a normed vector space. Moreover, it turns out that this space is complete, thus making it a Banach space. This Banach space is the -space over

Relations between p-norms

[edit]

The grid distance or rectilinear distance (sometimes called the "Manhattan distance") between two points is never shorter than the length of the line segment between them (the Euclidean or "as the crow flies" distance). Formally, this means that the Euclidean norm of any vector is bounded by its 1-norm:

This fact generalizes to -norms in that the -norm of any given vector does not grow with :

for any vector and real numbers and (In fact this remains true for and .)

For the opposite direction, the following relation between the -norm and the -norm is known:

This inequality depends on the dimension of the underlying vector space and follows directly from the Cauchy–Schwarz inequality.

In general, for vectors in where

This is a consequence of Hölder's inequality.

When 0 < p < 1

[edit]
Astroid, unit circle in metric

In for the formula defines an absolutely homogeneous function for however, the resulting function does not define a norm, because it is not subadditive. On the other hand, the formula defines a subadditive function at the cost of losing absolute homogeneity. It does define an F-norm, though, which is homogeneous of degree

Hence, the function defines a metric. The metric space is denoted by

Although the -unit ball around the origin in this metric is "concave", the topology defined on by the metric is the usual vector space topology of hence is a locally convex topological vector space. Beyond this qualitative statement, a quantitative way to measure the lack of convexity of is to denote by the smallest constant such that the scalar multiple of the -unit ball contains the convex hull of which is equal to The fact that for fixed we have shows that the infinite-dimensional sequence space defined below, is no longer locally convex.[citation needed]

When p = 0

[edit]

There is one norm and another function called the "norm" (with quotation marks).

The mathematical definition of the norm was established by Banach's Theory of Linear Operations. The space of sequences has a complete metric topology provided by the F-norm which is discussed by Stefan Rolewicz in Metric Linear Spaces.[2] The -normed space is studied in functional analysis, probability theory, and harmonic analysis.

Another function was called the "norm" by David Donoho—whose quotation marks warn that this function is not a proper norm—is the number of non-zero entries of the vector [citation needed] Many authors abuse terminology by omitting the quotation marks. Defining the zero "norm" of is equal to

An animated gif of p-norms 0.1 through 2 with a step of 0.05.
An animated gif of p-norms 0.1 through 2 with a step of 0.05.

This is not a norm because it is not homogeneous. For example, scaling the vector by a positive constant does not change the "norm". Despite these defects as a mathematical norm, the non-zero counting "norm" has uses in scientific computing, information theory, and statistics–notably in compressed sensing in signal processing and computational harmonic analysis. Despite not being a norm, the associated metric, known as Hamming distance, is a valid distance, since homogeneity is not required for distances.

The p-norm in infinite dimensions and p spaces

[edit]

The sequence space p

[edit]

The -norm can be extended to vectors that have an infinite number of components (sequences), which yields the space This contains as special cases:

  • the space of sequences whose series are absolutely convergent,
  • the space of square-summable sequences, which is a Hilbert space, and
  • the space of bounded sequences.

The space of sequences has a natural vector space structure by applying addition and scalar multiplication coordinate by coordinate. Explicitly, the vector sum and the scalar action for infinite sequences of real (or complex) numbers are given by:

Define the -norm:

Here, a complication arises, namely that the series on the right is not always convergent, so for example, the sequence made up of only ones, will have an infinite -norm for The space is then defined as the set of all infinite sequences of real (or complex) numbers such that the -norm is finite.

One can check that as increases, the set grows larger. For example, the sequence is not in but it is in for as the series diverges for (the harmonic series), but is convergent for

One also defines the -norm using the supremum: and the corresponding space of all bounded sequences. It turns out that[3] if the right-hand side is finite, or the left-hand side is infinite. Thus, we will consider spaces for

The -norm thus defined on is indeed a norm, and together with this norm is a Banach space. The fully general space is obtained—as seen below—by considering vectors, not only with finitely or countably-infinitely many components, but with "arbitrarily many components"; in other words, functions. An integral instead of a sum is used to define the -norm.

General ℓp-space

[edit]

In complete analogy to the preceding definition one can define the space over a general index set (and ) as where convergence on the right means that only countably many summands are nonzero (see also Unconditional convergence). With the norm the space becomes a Banach space. In the case where is finite with elements, this construction yields with the -norm defined above. If is countably infinite, this is exactly the sequence space defined above. For uncountable sets this is a non-separable Banach space which can be seen as the locally convex direct limit of -sequence spaces.[4]

For the -norm is even induced by a canonical inner product called the Euclidean inner product, which means that holds for all vectors This inner product can expressed in terms of the norm by using the polarization identity. On it can be defined by while for the space associated with a measure space which consists of all square-integrable functions, it is

Now consider the case Define[note 1] where for all [5][note 2]

The index set can be turned into a measure space by giving it the discrete σ-algebra and the counting measure. Then the space is just a special case of the more general -space (defined below).

Lp spaces and Lebesgue integrals

[edit]

An space may be defined as a space of measurable functions for which the -th power of the absolute value is Lebesgue integrable, where functions which agree almost everywhere are identified. More generally, let be a measure space and [note 3] When , consider the set of all measurable functions from to or whose absolute value raised to the -th power has a finite integral, or in symbols:

To define the set for recall that two functions and defined on are said to be equal almost everywhere, written a.e., if the set is measurable and has measure zero. Similarly, a measurable function (and its absolute value) is bounded (or dominated) almost everywhere by a real number written a.e., if the (necessarily) measurable set has measure zero. The space is the set of all measurable functions that are bounded almost everywhere (by some real ) and is defined as the infimum of these bounds: When then this is the same as the essential supremum of the absolute value of :[note 4]

For example, if is a measurable function that is equal to almost everywhere[note 5] then for every and thus for all

For every positive the value under of a measurable function and its absolute value are always the same (that is, for all ) and so a measurable function belongs to if and only if its absolute value does. Because of this, many formulas involving -norms are stated only for non-negative real-valued functions. Consider for example the identity which holds whenever is measurable, is real, and (here when ). The non-negativity requirement can be removed by substituting in for which gives Note in particular that when is finite then the formula relates the -norm to the -norm.

Seminormed space of -th power integrable functions

Each set of functions forms a vector space when addition and scalar multiplication are defined pointwise.[note 6] That the sum of two -th power integrable functions and is again -th power integrable follows from [proof 1] although it is also a consequence of Minkowski's inequality which establishes that satisfies the triangle inequality for (the triangle inequality does not hold for ). That is closed under scalar multiplication is due to being absolutely homogeneous, which means that for every scalar and every function

Absolute homogeneity, the triangle inequality, and non-negativity are the defining properties of a seminorm. Thus is a seminorm and the set of -th power integrable functions together with the function defines a seminormed vector space. In general, the seminorm is not a norm because there might exist measurable functions that satisfy but are not identically equal to [note 5] ( is a norm if and only if no such exists).

Zero sets of -seminorms

If is measurable and equals a.e. then for all positive On the other hand, if is a measurable function for which there exists some such that then almost everywhere. When is finite then this follows from the case and the formula mentioned above.

Thus if is positive and is any measurable function, then if and only if almost everywhere. Since the right hand side ( a.e.) does not mention it follows that all have the same zero set (it does not depend on ). So denote this common set by This set is a vector subspace of for every positive

Quotient vector space

Like every seminorm, the seminorm induces a norm (defined shortly) on the canonical quotient vector space of by its vector subspace This normed quotient space is called Lebesgue space and it is the subject of this article. We begin by defining the quotient vector space.

Given any the coset consists of all measurable functions that are equal to almost everywhere. The set of all cosets, typically denoted by forms a vector space with origin when vector addition and scalar multiplication are defined by and This particular quotient vector space will be denoted by

Two cosets are equal if and only if (or equivalently, ), which happens if and only if almost everywhere; if this is the case then and are identified in the quotient space.

The -norm on the quotient vector space

Given any the value of the seminorm on the coset is constant and equal to denote this unique value by so that: This assignment defines a map, which will also be denoted by on the quotient vector space This map is a norm on called the -norm. The value of a coset is independent of the particular function that was chosen to represent the coset, meaning that if is any coset then for every (since for every ).

The Lebesgue space

The normed vector space is called space or the Lebesgue space of -th power integrable functions and it is a Banach space for every (meaning that it is a complete metric space, a result that is sometimes called the Riesz–Fischer theorem). When the underlying measure space is understood then is often abbreviated or even just Depending on the author, the subscript notation might denote either or

If the seminorm on happens to be a norm (which happens if and only if ) then the normed space will be linearly isometrically isomorphic to the normed quotient space via the canonical map (since ); in other words, they will be, up to a linear isometry, the same normed space and so they may both be called " space".

The above definitions generalize to Bochner spaces.

In general, this process cannot be reversed: there is no consistent way to define a "canonical" representative of each coset of in For however, there is a theory of lifts enabling such recovery.

Special cases

[edit]

Similar to the spaces, is the only Hilbert space among spaces. In the complex case, the inner product on is defined by

The additional inner product structure allows for a richer theory, with applications to, for instance, Fourier series and quantum mechanics. Functions in are sometimes called square-integrable functions, quadratically integrable functions or square-summable functions, but sometimes these terms are reserved for functions that are square-integrable in some other sense, such as in the sense of a Riemann integral (Titchmarsh 1976).

If we use complex-valued functions, the space is a commutative C*-algebra with pointwise multiplication and conjugation. For many measure spaces, including all sigma-finite ones, it is in fact a commutative von Neumann algebra. An element of defines a bounded operator on any space by multiplication.

For the spaces are a special case of spaces, when consists of the natural numbers and is the counting measure on More generally, if one considers any set with the counting measure, the resulting space is denoted For example, the space is the space of all sequences indexed by the integers, and when defining the -norm on such a space, one sums over all the integers. The space where is the set with elements, is with its -norm as defined above. As any Hilbert space, every space is linearly isometric to a suitable where the cardinality of the set is the cardinality of an arbitrary Hilbertian basis for this particular

Properties of Lp spaces

[edit]

As in the discrete case, if there exists such that then[citation needed]

Hölder's inequality

Suppose satisfy (where ). If and then and[6]

This inequality, called Hölder's inequality, is in some sense optimal[6] since if (so ) and is a measurable function such that where the supremum is taken over the closed unit ball of then and

Minkowski inequality

Minkowski inequality, which states that satisfies the triangle inequality, can be generalized: If the measurable function is non-negative (where and are measure spaces) then for all [7]

Atomic decomposition

[edit]

If then every non-negative has an atomic decomposition,[8] meaning that there exist a sequence of non-negative real numbers and a sequence of non-negative functions called the atoms, whose supports are pairwise disjoint sets of measure such that and for every integer and and where moreover, the sequence of functions depends only on (it is independent of ).[8] These inequalities guarantee that for all integers while the supports of being pairwise disjoint implies[8]

An atomic decomposition can be explicitly given by first defining for every integer [8] (this infimum is attained by that is, holds) and then letting where denotes the measure of the set and denotes the indicator function of the set The sequence is decreasing and converges to as [8] Consequently, if then and so that is identically equal to (in particular, the division by causes no issues).

The complementary cumulative distribution function of that was used to def