Distinctions

The "Distinctions" datasets are multi-dimensional arrays of data on indigenous languages, obtained from the 2001, 2006, 2011 and 2016 censuses.

There are currently four different datasets.  Each one was created by merging Beyond 2020 datafiles into a single table, using Microsoft Access.  The single table was then "pivoted" on the various columns (dimensions), using custom SQL statements, to create simple views of the data, suitable for delivery as MS Excel files.
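
The pivoting step can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MS Access, and conditional aggregation as a stand-in for whatever custom SQL was actually written (Access also offers TRANSFORM ... PIVOT crosstab queries, but the text does not say which approach was used). The table name `merged`, the columns `Year_ID`, `Sex_ID` and `Pop`, and all figures are hypothetical.

```python
import sqlite3

# Hypothetical merged table: one row per combination of dimension values.
# Column names and population figures are invented for illustration only.
rows = [
    (2001, 1, 100),
    (2001, 2, 48),
    (2001, 3, 52),
    (2006, 1, 120),
    (2006, 2, 58),
    (2006, 3, 62),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE merged (Year_ID INTEGER, Sex_ID INTEGER, Pop INTEGER)")
conn.executemany("INSERT INTO merged VALUES (?, ?, ?)", rows)

# Pivot on the Sex dimension: each Sex_ID value becomes its own column.
# SQLite has no crosstab statement, so conditional aggregation is used here.
pivot_sql = """
    SELECT Year_ID,
           SUM(CASE WHEN Sex_ID = 1 THEN Pop END) AS Pop_Sex1,
           SUM(CASE WHEN Sex_ID = 2 THEN Pop END) AS Pop_Sex2,
           SUM(CASE WHEN Sex_ID = 3 THEN Pop END) AS Pop_Sex3
    FROM merged
    GROUP BY Year_ID
    ORDER BY Year_ID
"""
pivoted = conn.execute(pivot_sql).fetchall()
conn.close()

for row in pivoted:
    print(row)
```

Each output row holds one census year, with one population column per value of the pivoted dimension, which is the general shape of a view that can be exported to a spreadsheet.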

The different dimensions are summarised for each of the four datasets, 1, 2, 3 and 4, in the table below.  The numbers in each cell correspond to the size of the dimension. Dataset #4 includes additional "recode" records for the language dimension.

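| N | Dimension | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 | Notes |
|---|---|---|---|---|---|---|
| 1 | Census Year | 3 | 1 | 3 | 4 | |
| 2 | Language | 44, 46, 77 | 77 | 3 | 45, 47+2, 77+44, 86+55 | Number of languages varies by Census Year |
| 3 | Geography | 17 | 169 | 164, 166, 169 | 17 | Number of CMAs varies by Census Year |
| 4 | Sex | 3 | 3 | 3 | 3 | |
| 5 | Aboriginal Identity | 8 | 5 | 8 | 8 | |
| 6 | Registered Status | 3 | | 3 | 3 | |
| 7 | Residence | 7 | 7 | 7 | 7 | |
| 8 | Age Groups | 9 | | | 8 | |
| 9 | Mother Tongue | 3 | 3 | 1 | 3 | |
| 10 | Language Knowledge | 4 | 4 | | 4 | |
| 11 | Home Language A | 4 | 4 | | 4 | |
| 12 | Home Language B | 4 | 4 | | 4 | |
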
Most of the dimensions are identical across all four censuses.  The language dimension, however, changes from one census to the next, which makes it difficult to compare language data between censuses.  A subset of 32 language cases is consistent, and thus comparable, across all three censuses.  Similarly, the set of CMAs changes gradually from one census to the next, with some disappearing and new ones appearing; there are 176 distinct CMAs across the three censuses from 2001 to 2011.  Dataset #4 includes an additional language category for 2001 and 2006: "Subtotal Dene and Chipewyan".  The different dimensions are summarised here.

Cross-sections through the multi-dimensional data are presented here as a series of large MS-Excel spreadsheets.  In every case, dimensions 9 to 12 (MT, KN, HLA, HLB) are held constant with values of "1" (= total).  24 variables are obtained either directly from the original data, or by simple calculation:

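| N | Var Type | Data Type |
|---|---|---|
| 1 | Tot | Pop |
| 2 | Tot | AvAge |
| 3 | All | Pop |
| 4 | All | AvAge |
| 5 | All | Pct |
| 6 | MT | Pop |
| 7 | MT | AvAge |
| 8 | MT | Pct |
| 9 | KN | Pop |
| 10 | KN | AvAge |
| 11 | KN | Pct |
| 12 | HLA | Pop |
| 13 | HLA | AvAge |
| 14 | HLA | Pct |
| 15 | HLB | Pop |
| 16 | HLB | AvAge |
| 17 | HLB | Pct |
| 18 | HLAB | Pop |
| 19 | HLAB | AvAge |
| 20 | HLAB | Pct |
| 21 | SLA | Index |
| 22 | CI | Index |
| 23 | HLA | Index |
| 24 | HLB | Index |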

In the above table, "Tot" refers to the total population, obtained by specifying Lang_ID = 1 and AbIdent_ID = 1, whilst "All" refers to the population with Lang_ID = 1 but with AbIdent_ID left free to vary.
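
The difference between the two selections can be sketched as follows. The field names Lang_ID and AbIdent_ID come from the text; the `Pop` field, the record layout and all figures are hypothetical.

```python
# Hypothetical records; only the names Lang_ID and AbIdent_ID come from the
# text. The Pop field and every figure below are invented for illustration.
records = [
    {"Lang_ID": 1, "AbIdent_ID": 1, "Pop": 1000},
    {"Lang_ID": 1, "AbIdent_ID": 2, "Pop": 40},
    {"Lang_ID": 1, "AbIdent_ID": 3, "Pop": 25},
    {"Lang_ID": 2, "AbIdent_ID": 1, "Pop": 8},
    {"Lang_ID": 2, "AbIdent_ID": 2, "Pop": 7},
]

# "Tot": both Lang_ID and AbIdent_ID are fixed at 1.
tot_pop = [r["Pop"] for r in records
           if r["Lang_ID"] == 1 and r["AbIdent_ID"] == 1]

# "All": Lang_ID is fixed at 1, and one value is kept per AbIdent_ID.
all_pop = {r["AbIdent_ID"]: r["Pop"] for r in records if r["Lang_ID"] == 1}

print(tot_pop)
print(all_pop)
```

In this sketch "Tot" yields a single figure, whereas "All" yields one figure for each identity category.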


Practical constraints (most critically, the 255-column limit for tables in MS Access) limit the number of cases that can be displayed in a cross-tabulation to 9.  As a result, some columns were dropped for the "Geography" (17 cases in set A; over 100 in sets B and C) and "Language" (32 cases) dimensions.  For geography set A, data for NL, NS, PE, NB, AB, YT, NT and NU (8 of the 17 cases) are missing from the MS Excel spreadsheet.  For sets B and C, a sample of 24 CMAs is presented across a total of three spreadsheets.  For the language dimension, two spreadsheets were created: the first contains data for 8 language families, whilst the second contains data for 8 individual languages.  Both spreadsheets include data for "Total Aboriginal Languages".
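
The column arithmetic behind this limit can be sketched as below. It assumes that each displayed case contributes one column per variable, which the text implies but does not state; the number of row-label columns is not given in the text and is not modelled here.

```python
ACCESS_MAX_COLUMNS = 255   # column limit cited in the text
VARIABLES_PER_CASE = 24    # number of variables given in the text
CASES_SHOWN = 9            # cases per cross-tabulation, per the text

# Assumption: every case shown occupies one column for each variable.
value_columns = VARIABLES_PER_CASE * CASES_SHOWN
columns_left = ACCESS_MAX_COLUMNS - value_columns

# Columns that would be needed if one further case were displayed.
columns_with_one_more = VARIABLES_PER_CASE * (CASES_SHOWN + 1)
left_with_one_more = ACCESS_MAX_COLUMNS - columns_with_one_more

print(value_columns, columns_left)
print(columns_with_one_more, left_with_one_more)
```

Under that assumption, 9 cases occupy 216 columns and leave 39 for everything else, while a tenth case would leave only 15.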


MS Excel spreadsheets

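| N | Dimension | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 |
|---|---|---|---|---|---|
| 1 | Census Year | Link | | Link | Link |
| 2A | Language (A) | Link | Link | Link | Link |
| 2B | Language (B) | Link | Link | | |
| 3A | Geography (A) | Link | Link | Link | Link |
| 3B | Geography (B) | | Link | Link | Link |
| 3C | Geography (C) | | Link | Link | Link |
| 4 | Sex | Link | Link | Link | Link |
| 5 | Aboriginal Identity | Link | Link | Link | Link |
| 6 | Registered Status | Link | | Link | Link |
| 7 | Residence | Link | Link | Link | Link |
| 8 | Age Groups | Link | | | |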

Sample table

This sample table corresponds to a subset of data from the "Sex" spreadsheet in dataset 1, with the following filters applied:

  • Lang_Alt_txt = (03) Aboriginal
  • Geog_txt = (01) Canada
  • AbIdent_txt = (2) Ab
  • Regist_txt = (1) Total
  • AgeGroup_txt = (01) Total
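
The filters listed above can be expressed as a simple row selection. The filter fields and values are quoted from the list; the `Sex_txt` and `Pop` fields, their labels and all figures are hypothetical.

```python
# Filter values quoted from the list above.
FILTERS = {
    "Lang_Alt_txt": "(03) Aboriginal",
    "Geog_txt": "(01) Canada",
    "AbIdent_txt": "(2) Ab",
    "Regist_txt": "(1) Total",
    "AgeGroup_txt": "(01) Total",
}


def matches(row, filters):
    """Return True when the row has the required value for every filtered field."""
    return all(row.get(field) == value for field, value in filters.items())


# Hypothetical spreadsheet rows: Sex_txt, Pop and all labels are invented.
rows = [
    {**FILTERS, "Sex_txt": "(1) Total", "Pop": 100},
    {**FILTERS, "Sex_txt": "(2) Male", "Pop": 48},
    {**FILTERS, "Geog_txt": "(99) Elsewhere", "Sex_txt": "(1) Total", "Pop": 10},
]

selected = [row for row in rows if matches(row, FILTERS)]

for row in selected:
    print(row["Sex_txt"], row["Pop"])
```

Only rows that satisfy all five filters are kept, so the third row in this sketch is excluded because its geography differs.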

The first column in the sample table is a cross-reference to the tables in an appendix of the following report:

Norris, Mary Jane (2017). Trends in the State of Indigenous Languages in Canada, 1996 to 2011: Selected Characteristics: Age, Gender, Place of Residence. Report prepared under contract with the Departments of Indigenous and Northern Affairs Canada and Canadian Heritage.
