Distinctions

The "Distinctions" datasets are multi-dimensional arrays of data on indigenous languages, obtained from the 2001, 2006, 2011 and 2016 censuses.

There are currently four different datasets. Each one was created by merging Beyond 2020 datafiles into a single table, using Microsoft Access. The single table was then "pivoted" on the various columns (dimensions), using custom SQL statements, to create simple views of the data, suitable for delivery as MS Excel files.
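The pivoting step described above can be sketched with a small in-memory example. This is not the original Access SQL: it uses SQLite and portable conditional aggregation instead, and the table name `merged`, the column names other than `Lang_ID`, the sex coding (1 = total, 2 = male, 3 = female) and all the numbers are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical long-format table: one row per combination of dimension values.
# All values below are invented for illustration.
rows = [
    # (Census_Year, Lang_ID, Sex_ID, Pop)
    (2001, 1, 1, 1000),
    (2001, 1, 2, 480),
    (2001, 1, 3, 520),
    (2006, 1, 1, 1100),
    (2006, 1, 2, 530),
    (2006, 1, 3, 570),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE merged ("
    "Census_Year INTEGER, Lang_ID INTEGER, Sex_ID INTEGER, Pop INTEGER)"
)
conn.executemany("INSERT INTO merged VALUES (?, ?, ?, ?)", rows)

# Pivot on the Sex dimension: each Sex_ID value becomes its own column,
# leaving one row per (Census_Year, Lang_ID) pair.
pivot_sql = """
SELECT Census_Year,
       Lang_ID,
       SUM(CASE WHEN Sex_ID = 1 THEN Pop END) AS Pop_Total,
       SUM(CASE WHEN Sex_ID = 2 THEN Pop END) AS Pop_Male,
       SUM(CASE WHEN Sex_ID = 3 THEN Pop END) AS Pop_Female
FROM merged
GROUP BY Census_Year, Lang_ID
ORDER BY Census_Year
"""
pivoted = conn.execute(pivot_sql).fetchall()

for row in pivoted:
    print(row)
```

Each pivoted row is a flat record that can be written directly to a spreadsheet, which is the kind of "simple view" the text refers to.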

The different dimensions are summarised for each of the four datasets, 1, 2, 3 and 4, in the table below. The numbers in each cell correspond to the size of the dimension. Dataset #4 includes additional "recode" records for the language dimension.

N	Dimension	1	2	3	4	Notes
1	Census Year	3	1	3	4
2	Language	44, 46, 77	77	3	45, 47+2, 77+44, 86+55	Number of languages varies by Census Year
3	Geography	17	169	164, 166, 169	17	Number of CMAs varies by Census Year
4	Sex	3	3	3	3
5	Aboriginal Identity	8	5	8	8
6	Registered Status	3		3	3
7	Residence	7	7	7	7
8	Age Groups	9			8
9	Mother Tongue	3	3	1	3
10	Language Knowledge	4	4		4
11	Home Language A	4	4		4
12	Home Language B	4	4		4

Most of the dimensions are identical across all four censuses. The language dimension, however, changes from one census to the next, which makes it difficult to compare data from different censuses. A subset of 32 cases that are consistent, and thus comparable, across all three censuses can nevertheless be extracted from the language dimension. Similarly, the set of CMAs changes gradually from one census to the next, with some disappearing and new ones appearing; there are 176 distinct CMAs across the three censuses from 2001 to 2011. Dataset #4 includes an additional language category for 2001 and 2006: "Subtotal Dene and Chipewyan".
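One way to picture the extraction of a comparable subset is as a set intersection over the language cases present in each census. The sketch below assumes that cases can be matched by label; the labels are placeholders, not the actual language lists, and the function name `comparable_cases` is hypothetical.

```python
# Placeholder language labels per census year (not the real lists).
languages_by_year = {
    2001: {"Language A", "Language B", "Language C"},
    2006: {"Language A", "Language B", "Language D"},
    2011: {"Language A", "Language B", "Language D", "Language E"},
}


def comparable_cases(by_year):
    """Return the cases that appear in every census year, sorted by label."""
    return sorted(set.intersection(*by_year.values()))


print(comparable_cases(languages_by_year))
```

The same approach applies to the CMAs: the union of the per-year sets gives the number of distinct CMAs, and the intersection gives those present in every census.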

Cross-sections through the multi-dimensional data are presented here as a series of large MS Excel spreadsheets. In every case, dimensions 9 to 12 (MT, KN, HLA, HLB) are held constant at the value "1" (= total). A total of 24 variables are obtained, either directly from the original data or by simple calculation:

N	Var Type	Data Type
1	Tot	Pop
2	Tot	AvAge
3	All	Pop
4	All	AvAge
5	All	Pct
6	MT	Pop
7	MT	AvAge
8	MT	Pct
9	KN	Pop
10	KN	AvAge
11	KN	Pct
12	HLA	Pop
13	HLA	AvAge
14	HLA	Pct
15	HLB	Pop
16	HLB	AvAge
17	HLB	Pct
18	HLAB	Pop
19	HLAB	AvAge
20	HLAB	Pct
21	SLA	Index
22	CI	Index
23	HLA	Index
24	HLB	Index
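The 24 variables in the table above follow a regular pattern: two for "Tot", three for each of six population types, and four indices. As a quick check of that count, the list can be regenerated programmatically; the variable names in the code are illustrative only.

```python
from itertools import product

# "Tot" has only a population count and an average age.
variables = [("Tot", "Pop"), ("Tot", "AvAge")]

# Each of these six types has a population count, an average age and a percentage.
pop_types = ["All", "MT", "KN", "HLA", "HLB", "HLAB"]
variables += list(product(pop_types, ["Pop", "AvAge", "Pct"]))

# Four index variables close the list.
variables += [(name, "Index") for name in ["SLA", "CI", "HLA", "HLB"]]

# Print in the same layout as the table: N, Var Type, Data Type.
for n, (var_type, data_type) in enumerate(variables, start=1):
    print(n, var_type, data_type, sep="\t")
```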

In the above table, "Tot" refers to the total population, obtained by specifying Lang_ID = 1 and AbIdent_ID = 1, whilst "All" refers to the population with Lang_ID = 1 and AbIdent_ID left free to vary.
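The difference between the two selections can be sketched as two filters over the same records. The field names `Lang_ID` and `AbIdent_ID` come from the text; the `Pop` field, the function names and all the values are assumptions for illustration.

```python
# Invented records; only the field names Lang_ID and AbIdent_ID are from the text.
records = [
    {"Lang_ID": 1, "AbIdent_ID": 1, "Pop": 1000},
    {"Lang_ID": 1, "AbIdent_ID": 2, "Pop": 300},
    {"Lang_ID": 2, "AbIdent_ID": 1, "Pop": 50},
    {"Lang_ID": 2, "AbIdent_ID": 2, "Pop": 40},
]


def tot_rows(recs):
    """'Tot': both the language and the identity dimension fixed at 1."""
    return [r for r in recs if r["Lang_ID"] == 1 and r["AbIdent_ID"] == 1]


def all_rows(recs):
    """'All': language fixed at 1, any value of AbIdent_ID."""
    return [r for r in recs if r["Lang_ID"] == 1]


print([r["Pop"] for r in tot_rows(records)])
print([r["Pop"] for r in all_rows(records)])
```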

Practical constraints (most critically, the 255-column limit for tables in MS Access) limit the number of cases that can be displayed in a cross-tabulation to 9. As a result, some cases were dropped for the "Geography" (17 cases in set A; over 100 in sets B and C) and "Language" (32 cases) dimensions. For the geography dimension in set A, data for NL, NS, PE, NB, AB, YT, NT and NU (8 cases out of 17) are omitted from the MS Excel spreadsheet. For sets B and C, a sampling of 24 CMAs is presented in a total of three spreadsheets. For the language dimension, two spreadsheets were created: the first contains data for 8 language families, whilst the second contains data for 8 individual languages. Both spreadsheets include data for "Total Aboriginal Languages".
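The effect of the 9-case limit can be illustrated with a small helper that spreads a list of cases over several sheets, repeating the total case on each one. This is only a sketch: it assumes that each case contributes one column per variable (24 per case), and the function name, the constants and the placeholder case labels are hypothetical. The actual spreadsheets group cases by type (families versus individual languages) rather than by list order.

```python
MAX_COLUMNS = 255        # MS Access table limit cited in the text
VARIABLES_PER_CASE = 24  # assumption: one column per variable for each case
MAX_CASES = 9            # per-sheet limit stated in the text


def split_into_sheets(cases, total_case, max_cases=MAX_CASES):
    """Split cases into sheets of at most max_cases, each led by the total case."""
    per_sheet = max_cases - 1  # one slot on every sheet is reserved for the total
    others = [c for c in cases if c != total_case]
    return [
        [total_case] + others[i:i + per_sheet]
        for i in range(0, len(others), per_sheet)
    ]


# Placeholder labels: a total plus 16 other cases.
cases = ["Total Aboriginal Languages"] + [f"Case {i}" for i in range(1, 17)]
sheets = split_into_sheets(cases, "Total Aboriginal Languages")

for sheet in sheets:
    data_columns = len(sheet) * VARIABLES_PER_CASE
    print(len(sheet), "cases,", data_columns, "data columns")
```

Under these assumptions, each sheet carries 9 cases and 216 data columns, which stays below the 255-column limit while leaving room for the identifying columns.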

MS Excel spreadsheets

N	Dimension	1	2	3	4
1	Census Year	Link		Link	Link
2A	Language (A)	Link	Link	Link	Link
2B	Language (B)	Link	Link
3A	Geography (A)	Link	Link	Link	Link
3B	Geography (B)		Link	Link	Link
3C	Geography (C)		Link	Link	Link
4	Sex	Link	Link	Link	Link
5	Aboriginal Identity	Link	Link	Link	Link
6	Registered Status	Link		Link	Link
7	Residence	Link	Link	Link	Link
8	Age Groups	Link

Sample table

This sample table corresponds to a subset of data from the "Sex" spreadsheet in dataset 1, with the following filters applied:

Lang_Alt_txt = (03) Aboriginal
Geog_txt = (01) Canada
AbIdent_txt = (2) Ab
Regist_txt = (1) Total
AgeGroup_txt = (01) Total

The first column in the sample table is a cross-reference to the tables in an appendix of the following report:

Norris, Mary Jane (2017). Trends in the State of Indigenous Languages in Canada, 1996 to 2011: Selected Characteristics: Age, Gender, Place of Residence. Report prepared under contract with the Departments of Indigenous and Northern Affairs Canada and Canadian Heritage.

Date modified: 2023-02-24