

 Amazon Redshift will no longer support the creation of new Python UDFs starting with Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [blog post](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/). 

# SQL commands
<a name="c_SQL_commands"></a>

The SQL language consists of commands that you use to create and manipulate database objects, run queries, load tables, and modify the data in tables.

Amazon Redshift is based on PostgreSQL. Amazon Redshift and PostgreSQL have a number of important differences that you must be aware of as you design and develop your data warehouse applications. For more information about how Amazon Redshift SQL differs from PostgreSQL, see [Amazon Redshift and PostgreSQL](c_redshift-and-postgres-sql.md).

**Note**  
The maximum size for a single SQL statement is 16 MB.

**Topics**
+ [ABORT](r_ABORT.md)
+ [ALTER DATABASE](r_ALTER_DATABASE.md)
+ [ALTER DATASHARE](r_ALTER_DATASHARE.md)
+ [ALTER DEFAULT PRIVILEGES](r_ALTER_DEFAULT_PRIVILEGES.md)
+ [ALTER EXTERNAL SCHEMA](r_ALTER_EXTERNAL_SCHEMA.md)
+ [ALTER EXTERNAL VIEW](r_ALTER_EXTERNAL_VIEW.md)
+ [ALTER FUNCTION](r_ALTER_FUNCTION.md)
+ [ALTER GROUP](r_ALTER_GROUP.md)
+ [ALTER IDENTITY PROVIDER](r_ALTER_IDENTITY_PROVIDER.md)
+ [ALTER MASKING POLICY](r_ALTER_MASKING_POLICY.md)
+ [ALTER MATERIALIZED VIEW](r_ALTER_MATERIALIZED_VIEW.md)
+ [ALTER RLS POLICY](r_ALTER_RLS_POLICY.md)
+ [ALTER ROLE](r_ALTER_ROLE.md)
+ [ALTER PROCEDURE](r_ALTER_PROCEDURE.md)
+ [ALTER SCHEMA](r_ALTER_SCHEMA.md)
+ [ALTER SYSTEM](r_ALTER_SYSTEM.md)
+ [ALTER TABLE](r_ALTER_TABLE.md)
+ [ALTER TABLE APPEND](r_ALTER_TABLE_APPEND.md)
+ [ALTER TEMPLATE](r_ALTER_TEMPLATE.md)
+ [ALTER USER](r_ALTER_USER.md)
+ [ANALYZE](r_ANALYZE.md)
+ [ANALYZE COMPRESSION](r_ANALYZE_COMPRESSION.md)
+ [ATTACH MASKING POLICY](r_ATTACH_MASKING_POLICY.md)
+ [ATTACH RLS POLICY](r_ATTACH_RLS_POLICY.md)
+ [BEGIN](r_BEGIN.md)
+ [CALL](r_CALL_procedure.md)
+ [CANCEL](r_CANCEL.md)
+ [CLOSE](close.md)
+ [COMMENT](r_COMMENT.md)
+ [COMMIT](r_COMMIT.md)
+ [COPY](r_COPY.md)
+ [CREATE DATABASE](r_CREATE_DATABASE.md)
+ [CREATE DATASHARE](r_CREATE_DATASHARE.md)
+ [CREATE EXTERNAL FUNCTION](r_CREATE_EXTERNAL_FUNCTION.md)
+ [CREATE EXTERNAL MODEL](r_create_external_model.md)
+ [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md)
+ [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md)
+ [CREATE EXTERNAL VIEW](r_CREATE_EXTERNAL_VIEW.md)
+ [CREATE FUNCTION](r_CREATE_FUNCTION.md)
+ [CREATE GROUP](r_CREATE_GROUP.md)
+ [CREATE IDENTITY PROVIDER](r_CREATE_IDENTITY_PROVIDER.md)
+ [CREATE LIBRARY](r_CREATE_LIBRARY.md)
+ [CREATE MASKING POLICY](r_CREATE_MASKING_POLICY.md)
+ [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md)
+ [CREATE MODEL](r_CREATE_MODEL.md)
+ [CREATE PROCEDURE](r_CREATE_PROCEDURE.md)
+ [CREATE RLS POLICY](r_CREATE_RLS_POLICY.md)
+ [CREATE ROLE](r_CREATE_ROLE.md)
+ [CREATE SCHEMA](r_CREATE_SCHEMA.md)
+ [CREATE TABLE](r_CREATE_TABLE_NEW.md)
+ [CREATE TABLE AS](r_CREATE_TABLE_AS.md)
+ [CREATE TEMPLATE](r_CREATE_TEMPLATE.md)
+ [CREATE USER](r_CREATE_USER.md)
+ [CREATE VIEW](r_CREATE_VIEW.md)
+ [DEALLOCATE](r_DEALLOCATE.md)
+ [DECLARE](declare.md)
+ [DELETE](r_DELETE.md)
+ [DESC DATASHARE](r_DESC_DATASHARE.md)
+ [DESC IDENTITY PROVIDER](r_DESC_IDENTITY_PROVIDER.md)
+ [DETACH MASKING POLICY](r_DETACH_MASKING_POLICY.md)
+ [DETACH RLS POLICY](r_DETACH_RLS_POLICY.md)
+ [DROP DATABASE](r_DROP_DATABASE.md)
+ [DROP DATASHARE](r_DROP_DATASHARE.md)
+ [DROP EXTERNAL VIEW](r_DROP_EXTERNAL_VIEW.md)
+ [DROP FUNCTION](r_DROP_FUNCTION.md)
+ [DROP GROUP](r_DROP_GROUP.md)
+ [DROP IDENTITY PROVIDER](r_DROP_IDENTITY_PROVIDER.md)
+ [DROP LIBRARY](r_DROP_LIBRARY.md)
+ [DROP MASKING POLICY](r_DROP_MASKING_POLICY.md)
+ [DROP MODEL](r_DROP_MODEL.md)
+ [DROP MATERIALIZED VIEW](materialized-view-drop-sql-command.md)
+ [DROP PROCEDURE](r_DROP_PROCEDURE.md)
+ [DROP RLS POLICY](r_DROP_RLS_POLICY.md)
+ [DROP ROLE](r_DROP_ROLE.md)
+ [DROP SCHEMA](r_DROP_SCHEMA.md)
+ [DROP TABLE](r_DROP_TABLE.md)
+ [DROP TEMPLATE](r_DROP_TEMPLATE.md)
+ [DROP USER](r_DROP_USER.md)
+ [DROP VIEW](r_DROP_VIEW.md)
+ [END](r_END.md)
+ [EXECUTE](r_EXECUTE.md)
+ [EXPLAIN](r_EXPLAIN.md)
+ [FETCH](fetch.md)
+ [GRANT](r_GRANT.md)
+ [INSERT](r_INSERT_30.md)
+ [INSERT (external table)](r_INSERT_external_table.md)
+ [LOCK](r_LOCK.md)
+ [MERGE](r_MERGE.md)
+ [PREPARE](r_PREPARE.md)
+ [REFRESH MATERIALIZED VIEW](materialized-view-refresh-sql-command.md)
+ [RESET](r_RESET.md)
+ [REVOKE](r_REVOKE.md)
+ [ROLLBACK](r_ROLLBACK.md)
+ [SELECT](r_SELECT_synopsis.md)
+ [SELECT INTO](r_SELECT_INTO.md)
+ [SET](r_SET.md)
+ [SET SESSION AUTHORIZATION](r_SET_SESSION_AUTHORIZATION.md)
+ [SET SESSION CHARACTERISTICS](r_SET_SESSION_CHARACTERISTICS.md)
+ [SHOW](r_SHOW.md)
+ [SHOW COLUMN GRANTS](r_SHOW_COLUMN_GRANTS.md)
+ [SHOW COLUMNS](r_SHOW_COLUMNS.md)
+ [SHOW CONSTRAINTS](r_SHOW_CONSTRAINTS.md)
+ [SHOW EXTERNAL TABLE](r_SHOW_EXTERNAL_TABLE.md)
+ [SHOW DATABASES](r_SHOW_DATABASES.md)
+ [SHOW FUNCTIONS](r_SHOW_FUNCTIONS.md)
+ [SHOW GRANTS](r_SHOW_GRANTS.md)
+ [SHOW MODEL](r_SHOW_MODEL.md)
+ [SHOW DATASHARES](r_SHOW_DATASHARES.md)
+ [SHOW PARAMETERS](r_SHOW_PARAMETERS.md)
+ [SHOW POLICIES](r_SHOW_POLICIES.md)
+ [SHOW PROCEDURE](r_SHOW_PROCEDURE.md)
+ [SHOW PROCEDURES](r_SHOW_PROCEDURES.md)
+ [SHOW SCHEMAS](r_SHOW_SCHEMAS.md)
+ [SHOW TABLE](r_SHOW_TABLE.md)
+ [SHOW TABLES](r_SHOW_TABLES.md)
+ [SHOW TEMPLATE](r_SHOW_TEMPLATE.md)
+ [SHOW TEMPLATES](r_SHOW_TEMPLATES.md)
+ [SHOW VIEW](r_SHOW_VIEW.md)
+ [START TRANSACTION](r_START_TRANSACTION.md)
+ [TRUNCATE](r_TRUNCATE.md)
+ [UNLOAD](r_UNLOAD.md)
+ [UPDATE](r_UPDATE.md)
+ [USE](r_USE_command.md)
+ [VACUUM](r_VACUUM_command.md)

# ABORT
<a name="r_ABORT"></a>

Stops the currently running transaction and discards all updates made by that transaction. ABORT has no effect on already completed transactions.

This command performs the same function as the ROLLBACK command. For information, see [ROLLBACK](r_ROLLBACK.md).

## Syntax
<a name="r_ABORT-synopsis"></a>

```
ABORT [ WORK | TRANSACTION ]
```

## Parameters
<a name="r_ABORT-parameters"></a>

WORK  
Optional keyword.

TRANSACTION  
Optional keyword; WORK and TRANSACTION are synonyms.

## Example
<a name="r_ABORT-example"></a>

The following example creates a table then starts a transaction where data is inserted into the table. The ABORT command then rolls back the data insertion to leave the table empty.

The following command creates an example table called MOVIE\_GROSS:

```
create table movie_gross( name varchar(30), gross bigint );
```

The next set of commands starts a transaction that inserts two data rows into the table:

```
begin;

insert into movie_gross values ( 'Raiders of the Lost Ark', 23400000);

insert into movie_gross values ( 'Star Wars', 10000000 );
```

Next, the following command selects the data from the table to show that it was successfully inserted:

```
select * from movie_gross;
```

The command output shows that both rows are successfully inserted:

```
         name           |  gross
------------------------+----------
Raiders of the Lost Ark | 23400000
Star Wars               | 10000000
(2 rows)
```

This command now rolls back the data changes to where the transaction began:

```
abort;
```

Selecting data from the table now shows an empty table:

```
select * from movie_gross;

 name | gross
------+-------
(0 rows)
```

# ALTER DATABASE
<a name="r_ALTER_DATABASE"></a>

Changes the attributes of a database.

## Required privileges
<a name="r_ALTER_DATABASE-privileges"></a>

To use ALTER DATABASE, one of the following privileges is required:
+ Superuser
+ Users with the ALTER DATABASE privilege
+ Database owner

## Syntax
<a name="r_ALTER_DATABASE-synopsis"></a>

```
ALTER DATABASE database_name
{ 
  RENAME TO new_name
  | OWNER TO new_owner
  | [ CONNECTION LIMIT { limit | UNLIMITED } ]
    [ COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI } ]
    [ ISOLATION LEVEL { SNAPSHOT | SERIALIZABLE } ]
| INTEGRATION
 { 
  REFRESH { { ALL | INERROR } TABLES [ IN SCHEMA schema [, ...] ] | TABLE schema.table [, ...] }
   | SET 
     [ QUERY_ALL_STATES [=] { TRUE | FALSE } ] 
     [ ACCEPTINVCHARS [=] { TRUE | FALSE } ] 
     [ REFRESH_INTERVAL <interval> ]
     [ TRUNCATECOLUMNS [=] { TRUE | FALSE } ]
     [ HISTORY_MODE [=] {TRUE | FALSE} [ FOR { {ALL} TABLES [IN SCHEMA schema [, ...] ] | TABLE schema.table [, ...] } ] ]
 }
}
```

## Parameters
<a name="r_ALTER_DATABASE-parameters"></a>

 *database\_name*   
Name of the database to alter. Typically, you alter a database that you are not currently connected to; in any case, the changes take effect only in subsequent sessions. You can change the owner of the current database, but you can't rename it:  

```
alter database tickit rename to newtickit;
ERROR:  current database may not be renamed
```

RENAME TO   
Renames the specified database. For more information about valid names, see [Names and identifiers](r_names.md). You can't rename the dev, padb\$1harvest, template0, template1, or sys:internal databases, and you can't rename the current database. Only the database owner or a [superuser](r_superusers.md#def_superusers) can rename a database; non-superuser owners must also have the CREATEDB privilege.

 *new\_name*   
New database name.

OWNER TO   
Changes the owner of the specified database. You can change the owner of the current database or some other database. Only a superuser can change the owner.

 *new\_owner*   
New database owner. The new owner must be an existing database user with write privileges. For more information about user privileges, see [GRANT](r_GRANT.md).

CONNECTION LIMIT \{ *limit* \| UNLIMITED \}   
The maximum number of database connections users are permitted to have open concurrently. The limit is not enforced for superusers. Use the UNLIMITED keyword to permit the maximum number of concurrent connections. A limit on the number of connections for each user might also apply. For more information, see [CREATE USER](r_CREATE_USER.md). The default is UNLIMITED. To view current connections, query the [STV\_SESSIONS](r_STV_SESSIONS.md) system view.  
If both user and database connection limits apply, an unused connection slot must be available that is within both limits when a user attempts to connect.
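
As an illustrative sketch of the CONNECTION LIMIT clause (the database name `sampledb` is reused from the examples later in this section):

```
ALTER DATABASE sampledb CONNECTION LIMIT 100;
```

To remove the limit, set it back to the default: `ALTER DATABASE sampledb CONNECTION LIMIT UNLIMITED;`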

COLLATE \{ CASE\_SENSITIVE \| CS \| CASE\_INSENSITIVE \| CI \}  
A clause that specifies whether string search or comparison is case-sensitive or case-insensitive.   
You can change the case sensitivity of the current database only if it's empty.  
You must have ALTER permission for the current database to change case sensitivity. Superusers or database owners with the CREATE DATABASE permission can also change database case sensitivity.  
CASE\_SENSITIVE and CS are interchangeable and yield the same results. Similarly, CASE\_INSENSITIVE and CI are interchangeable and yield the same results.

ISOLATION LEVEL \{ SNAPSHOT \| SERIALIZABLE \}  
A clause that specifies the isolation level used when queries run against a database. For more information on isolation levels, see [Isolation levels in Amazon Redshift](c_serial_isolation.md).  
+ SNAPSHOT isolation – provides an isolation level with protection against update and delete conflicts. 
+ SERIALIZABLE isolation – provides full serializability for concurrent transactions.
Consider the following items when altering the isolation level of a database:  
+ You must be a superuser or have the CREATE DATABASE privilege for the current database to change the database isolation level.
+ You can't alter the isolation level of the `dev` database. 
+ You can't alter the isolation level within a transaction block.
+ The alter isolation level command fails if other users are connected to the database.
+ The alter isolation level command can alter the isolation level settings of the current session.

INTEGRATION  
Alter a zero-ETL integration database.

REFRESH \{ \{ ALL \| INERROR \} TABLES [IN SCHEMA *schema* [, ...]] \| TABLE *schema.table* [, ...] \}  
A clause that specifies whether Amazon Redshift refreshes all tables, only tables with errors, in the specified schema or table. The refresh triggers the tables in the specified schema or table to be fully replicated from the source database.  
For more information, see [Zero-ETL integrations](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.html) in the *Amazon Redshift Management Guide*. For more information about integration states, see [SVV\_INTEGRATION\_TABLE\_STATE](r_SVV_INTEGRATION_TABLE_STATE.md) and [SVV\_INTEGRATION](r_SVV_INTEGRATION.md).

QUERY\_ALL\_STATES [=] \{ TRUE \| FALSE \}  
The QUERY\_ALL\_STATES clause sets whether zero-ETL integration tables can be queried in all states (`Synced`, `Failed`, `ResyncRequired`, and `ResyncInitiated`). By default, a zero-ETL integration table can only be queried in the `Synced` state.

ACCEPTINVCHARS [=] \{ TRUE \| FALSE \}  
The ACCEPTINVCHARS clause sets whether zero-ETL integration tables continue with ingestion when invalid characters are detected for the VARCHAR data type. When invalid characters are encountered, the invalid character is replaced with a default `?` character.

REFRESH\_INTERVAL <interval>  
The REFRESH\_INTERVAL clause sets the approximate time interval, in seconds, to refresh data from the zero-ETL source to the target database. The `interval` can be set from 0 to 432,000 seconds (5 days) for zero-ETL integrations whose source type is Aurora MySQL, Aurora PostgreSQL, or RDS for MySQL. For Amazon DynamoDB zero-ETL integrations, the `interval` can be set from 900 to 432,000 seconds (15 minutes to 5 days).  
For more information about creating databases with zero-ETL integrations, see [Creating destination databases in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.creating-db.html) in the *Amazon Redshift Management Guide*.

TRUNCATECOLUMNS [=] \{ TRUE \| FALSE \}  
The TRUNCATECOLUMNS clause sets whether zero-ETL integration tables continue with ingestion when the values for the VARCHAR column or SUPER column attributes are beyond the limit. When `TRUE`, the values are truncated to fit into the column and the values of overflowing JSON attributes are truncated to fit into the SUPER column.

HISTORY\_MODE [=] \{ TRUE \| FALSE \} [ FOR \{ \{ ALL \} TABLES [IN SCHEMA *schema* [, ...]] \| TABLE *schema.table* [, ...] \} ]  
A clause that specifies whether Amazon Redshift will set history mode for all tables or tables in the specified schema that participate in zero-ETL integration. This option is only applicable for databases created for zero-ETL integration.  
The HISTORY\_MODE clause can be set to `TRUE` or `FALSE`. The default is `FALSE`. Switching history mode on and off is only applicable to tables that are in the `Synced` state. For information about HISTORY\_MODE, see [History mode](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-history-mode.html) in the *Amazon Redshift Management Guide*.
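
Per the syntax shown earlier, several INTEGRATION SET options can be combined in a single statement. The following sketch assumes a zero-ETL destination database named `sample_integration_db`, as in the examples that follow:

```
ALTER DATABASE sample_integration_db INTEGRATION
SET QUERY_ALL_STATES = TRUE
    ACCEPTINVCHARS = TRUE
    TRUNCATECOLUMNS = TRUE;
```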

## Usage notes
<a name="r_ALTER_DATABASE-usage-notes"></a>

ALTER DATABASE commands apply to subsequent sessions not current sessions. You must reconnect to the altered database to see the effect of the change.

## Examples
<a name="r_ALTER_DATABASE-examples"></a>

The following example renames a database named TICKIT\_SANDBOX to TICKIT\_TEST: 

```
alter database tickit_sandbox rename to tickit_test;
```

The following example changes the owner of the TICKIT database (the current database) to DWUSER: 

```
alter database tickit owner to dwuser;
```

The following example changes the database case sensitivity of the sampledb database:

```
ALTER DATABASE sampledb COLLATE CASE_INSENSITIVE;
```

The following example alters a database named **sampledb** with SNAPSHOT isolation level.

```
ALTER DATABASE sampledb ISOLATION LEVEL SNAPSHOT;
```

The following example refreshes the tables **schema1.sample\_table1** and **schema2.sample\_table2** in the database **sample\_integration\_db** in your zero-ETL integration.

```
ALTER DATABASE sample_integration_db INTEGRATION REFRESH TABLE schema1.sample_table1, schema2.sample_table2;
```

The following example refreshes all synced and failed tables within your zero-ETL integration.

```
ALTER DATABASE sample_integration_db INTEGRATION REFRESH ALL TABLES;
```

The following example sets the refresh interval for zero-ETL integrations to 600 seconds.

```
ALTER DATABASE sample_integration_db INTEGRATION SET REFRESH_INTERVAL 600;
```

The following example refreshes all tables that are in the `ErrorState` in the schema **sample\_schema**.

```
ALTER DATABASE sample_integration_db INTEGRATION REFRESH INERROR TABLES IN SCHEMA sample_schema;
```

The following example switches history mode on for table `myschema.table1`.

```
ALTER DATABASE sample_integration_db INTEGRATION SET HISTORY_MODE = TRUE FOR TABLE myschema.table1;
```

The following example switches history mode on for all tables in `myschema`. 

```
ALTER DATABASE sample_integration_db INTEGRATION SET HISTORY_MODE = TRUE FOR ALL TABLES IN SCHEMA myschema;
```

# ALTER DATASHARE
<a name="r_ALTER_DATASHARE"></a>

Changes the definition of a datashare. Use ALTER DATASHARE to add objects to or remove objects from a datashare; you can only change a datashare in the current database, and the objects must belong to the database associated with the datashare. The owner of the datashare can alter it, provided that they have the required permissions on the datashare objects being added or removed.

## Required privileges
<a name="r_ALTER_DATASHARE-privileges"></a>

Following are required privileges for ALTER DATASHARE:
+ Superuser.
+ User with the ALTER DATASHARE privilege.
+ Users who have the ALTER or ALL privilege on the datashare.
+ To add specific objects to a datashare, users must have the privilege on the objects. For this case, users should be the owners of objects or have SELECT, USAGE, or ALL privileges on the objects. 

## Syntax
<a name="r_ALTER_DATASHARE-synopsis"></a>

The following syntax illustrates how to add objects to or remove objects from the datashare.

```
ALTER DATASHARE datashare_name { ADD | REMOVE } {
TABLE schema.table [, ...]
| SCHEMA schema [, ...]
| FUNCTION schema.sql_udf (argtype,...) [, ...]
| ALL TABLES IN SCHEMA schema [, ...]
| ALL FUNCTIONS IN SCHEMA schema [, ...] }
```

The following syntax illustrates how to configure the properties of the datashare.

```
ALTER DATASHARE datashare_name {
[ SET PUBLICACCESSIBLE [=] TRUE | FALSE ]
[ SET INCLUDENEW [=] TRUE | FALSE FOR SCHEMA schema ] }
```

## Parameters
<a name="r_ALTER_DATASHARE-parameters"></a>

*datashare\_name*  
The name of the datashare to be altered. 

ADD \| REMOVE  
A clause that specifies whether to add objects to or remove objects from the datashare.

TABLE *schema*.*table* [, ...]  
The name of the table or view in the specified schema to add to the datashare.

SCHEMA *schema* [, ...]   
The name of the schema to add to the datashare.

FUNCTION *schema*.*sql\_udf* (argtype,...) [, ...]  
The name of the user-defined SQL function with argument types to add to the datashare.

ALL TABLES IN SCHEMA *schema* [, ...]   
A clause that specifies whether to add all tables and views in the specified schema to the datashare.

ALL FUNCTIONS IN SCHEMA *schema* [, ...]  
A clause that specifies adding all functions in the specified schema to the datashare.

[ SET PUBLICACCESSIBLE [=] TRUE \| FALSE ]  
A clause that specifies whether a datashare can be shared to clusters that are publicly accessible.

[ SET INCLUDENEW [=] TRUE \| FALSE FOR SCHEMA *schema* ]  
A clause that specifies whether to add any future tables, views, or SQL user-defined functions (UDFs) created in the specified schema to the datashare. Current tables, views, or SQL UDFs in the specified schema aren't added to the datashare. Only superusers can change this property for each datashare-schema pair. By default, the INCLUDENEW clause is false. 

## ALTER DATASHARE usage notes
<a name="r_ALTER_DATASHARE_usage"></a>
+ The following users can alter a datashare:
  + A superuser
  + The owner of the datashare
  + Users that have ALTER or ALL privilege on the datashare
+ To add specific objects to a datashare, users must have the correct privileges on the objects. Users should be the owners of objects or have SELECT, USAGE, or ALL privileges on the objects.
+ You can share schemas, tables, regular views, late-binding views, materialized views, and SQL user-defined functions (UDFs). Add a schema to a datashare first before adding objects in the schema. 

  When you add a schema, Amazon Redshift doesn't add all the objects under it. You must add them explicitly. 
+ We recommend that you create AWS Data Exchange datashares with the publicly accessible setting turned on.
+ In general, we recommend that you don't alter an AWS Data Exchange datashare to turn off public accessibility using the ALTER DATASHARE statement. If you do, the AWS accounts that have access to the datashare lose access if their clusters are publicly accessible. Performing this type of alteration can breach data product terms in AWS Data Exchange. For an exception to this recommendation, see the following.

  The following example shows the error that occurs when you alter an AWS Data Exchange datashare to turn the publicly accessible setting off.

  ```
  ALTER DATASHARE salesshare SET PUBLICACCESSIBLE FALSE;
  ERROR:  Alter of ADX-managed datashare salesshare requires session variable datashare_break_glass_session_var to be set to value 'c670ba4db22f4b'
  ```

  To allow altering an AWS Data Exchange datashare to turn off the publicly accessible setting, set the following variable and run the ALTER DATASHARE statement again.

  ```
  SET datashare_break_glass_session_var to 'c670ba4db22f4b';
  ```

  ```
  ALTER DATASHARE salesshare SET PUBLICACCESSIBLE FALSE;
  ```

  In this case, Amazon Redshift generates a random one-time value to set the session variable to allow ALTER DATASHARE SET PUBLICACCESSIBLE FALSE for an AWS Data Exchange datashare.

## Examples
<a name="r_ALTER_DATASHARE_examples"></a>

The following example adds the `public` schema to the datashare `salesshare`.

```
ALTER DATASHARE salesshare ADD SCHEMA public;
```

The following example adds the `public.tickit_sales_redshift` table to the datashare `salesshare`.

```
ALTER DATASHARE salesshare ADD TABLE public.tickit_sales_redshift;
```

The following example adds all tables to the datashare `salesshare`.

```
ALTER DATASHARE salesshare ADD ALL TABLES IN SCHEMA PUBLIC;
```

The following example removes the `public.tickit_sales_redshift` table from the datashare `salesshare`.

```
ALTER DATASHARE salesshare REMOVE TABLE public.tickit_sales_redshift;
```
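
The remaining ADD variants and the datashare properties follow the same pattern. The following sketch adds a hypothetical SQL UDF `public.dist_mi_to_km` to the datashare and turns on INCLUDENEW for the `public` schema (only superusers can change the INCLUDENEW property):

```
ALTER DATASHARE salesshare ADD FUNCTION public.dist_mi_to_km(float);

ALTER DATASHARE salesshare SET INCLUDENEW = TRUE FOR SCHEMA public;
```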

# ALTER DEFAULT PRIVILEGES
<a name="r_ALTER_DEFAULT_PRIVILEGES"></a>

Defines the default set of access permissions to be applied to objects that are created in the future by the specified user. By default, users can change only their own default access permissions. Only a superuser can specify default permissions for other users.

You can apply default privileges to roles, users, or user groups. You can set default permissions globally for all objects created in the current database, or for objects created only in the specified schemas. 

Default permissions apply only to new objects. Running ALTER DEFAULT PRIVILEGES doesn’t change permissions on existing objects. To grant permissions on all current and future objects created by any user within a database or schema, see [Scoped permissions](https://docs.aws.amazon.com/redshift/latest/dg/t_scoped-permissions.html). 

To view information about the default privileges for database users, query the [PG\_DEFAULT\_ACL](r_PG_DEFAULT_ACL.md) system catalog table. 

For more information about privileges, see [GRANT](r_GRANT.md).

## Required privileges
<a name="r_ALTER_DEFAULT_PRIVILEGES-privileges"></a>

Following are required privileges for ALTER DEFAULT PRIVILEGES:
+ Superuser
+ Users with the ALTER DEFAULT PRIVILEGES privilege
+ Users changing their own default access privileges
+ Users setting privileges for schemas that they have access privileges to

## Syntax
<a name="r_ALTER_DEFAULT_PRIVILEGES-synopsis"></a>

```
ALTER DEFAULT PRIVILEGES
    [ FOR USER target_user [, ...] ]
    [ IN SCHEMA schema_name [, ...] ]
    grant_or_revoke_clause

where grant_or_revoke_clause is one of:

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES | TRUNCATE } [,...] | ALL [ PRIVILEGES ] }
	ON TABLES
	TO { user_name [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
	ON FUNCTIONS
	TO { user_name [ WITH GRANT OPTION ] |  ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
	ON PROCEDURES
	TO { user_name [ WITH GRANT OPTION ] |  ROLE role_name | GROUP group_name | PUBLIC } [, ...]

REVOKE [ GRANT OPTION FOR ] { { SELECT | INSERT | UPDATE | DELETE | REFERENCES | TRUNCATE } [,...] | ALL [ PRIVILEGES ] }
	ON TABLES
	FROM user_name [, ...] [ RESTRICT ]

REVOKE  { { SELECT | INSERT | UPDATE | DELETE | REFERENCES | TRUNCATE } [,...] | ALL [ PRIVILEGES ] }
	ON TABLES
	FROM { ROLE role_name | GROUP group_name | PUBLIC } [, ...] [ RESTRICT ]

REVOKE [ GRANT OPTION FOR ] { EXECUTE | ALL [ PRIVILEGES ] }
	ON FUNCTIONS
	FROM user_name [, ...] [ RESTRICT ]

REVOKE { EXECUTE | ALL [ PRIVILEGES ] }
	ON FUNCTIONS
	FROM { ROLE role_name | GROUP group_name | PUBLIC } [, ...] [ RESTRICT ]

REVOKE [ GRANT OPTION FOR ] { EXECUTE | ALL [ PRIVILEGES ] }
	ON PROCEDURES
	FROM user_name [, ...] [ RESTRICT ]

REVOKE { EXECUTE | ALL [ PRIVILEGES ] }
	ON PROCEDURES
	FROM { ROLE role_name | GROUP group_name | PUBLIC } [, ...] [ RESTRICT ]
```

## Parameters
<a name="r_ALTER_DEFAULT_PRIVILEGES-parameters"></a>

FOR USER *target\_user*  <a name="default-for-user"></a>
Optional. The name of the user for which default privileges are defined. Only a superuser can specify default privileges for other users. The default value is the current user.

IN SCHEMA *schema\_name*   <a name="default-in-schema"></a>
Optional. If an IN SCHEMA clause appears, the specified default privileges are applied to new objects created in the specified *schema\_name*. In this case, the user or user group that is the target of ALTER DEFAULT PRIVILEGES must have CREATE privilege for the specified schema. Default privileges that are specific to a schema are added to existing global default privileges. By default, default privileges are applied globally to the entire database. 

GRANT   <a name="default-grant"></a>
The set of privileges to grant to the specified users or groups for all new tables and views, functions, or stored procedures created by the specified user. You can set the same privileges and options with the GRANT clause that you can with the [GRANT](r_GRANT.md) command. 

WITH GRANT OPTION   <a name="default-grant-option"></a>
A clause that indicates that the user receiving the privileges can in turn grant the same privileges to others. You can't grant WITH GRANT OPTION to a group or to PUBLIC. 

TO *user\_name* \| ROLE *role\_name* \| GROUP *group\_name*   <a name="default-to"></a>
The name of the user, role, or user group to which the specified default privileges are applied.

REVOKE   <a name="default-revoke"></a>
The set of privileges to revoke from the specified users or groups for all new tables, functions, or stored procedures created by the specified user. You can set the same privileges and options with the REVOKE clause that you can with the [REVOKE](r_REVOKE.md) command. 

GRANT OPTION FOR  <a name="default-revoke-option"></a>
 A clause that revokes only the option to grant a specified privilege to other users and doesn't revoke the privilege itself. You can't revoke GRANT OPTION from a group or from PUBLIC. 

FROM *user\_name* \| ROLE *role\_name* \| GROUP *group\_name*  <a name="default-from"></a>
The name of the user, role, or user group from which the specified privileges are revoked by default.

RESTRICT   <a name="default-restrict"></a>
The RESTRICT option revokes only those privileges that the user directly granted. This is the default.

## Examples
<a name="r_ALTER_DEFAULT_PRIVILEGES-examples"></a>

Suppose that you want to allow any user in the user group `report_readers` to view all tables and views created by the user `report_admin`. In this case, run the following command as a superuser. 

```
alter default privileges for user report_admin grant select on tables to group report_readers; 
```

The following command grants SELECT privileges on all new tables that you create. Anytime you create a new view, you must explicitly grant privileges on the view or run the `alter default privileges` command again.

```
alter default privileges grant select on tables to public; 
```

The following example grants INSERT privilege to the `sales_admin` user group for all new tables and views that you create in the `sales` schema. 

```
alter default privileges in schema sales grant insert on tables to group sales_admin; 
```

The following example reverses the ALTER DEFAULT PRIVILEGES command in the preceding example. 

```
alter default privileges in schema sales revoke insert on tables from group sales_admin;
```

By default, the PUBLIC user group has execute permission for all new user-defined functions. To revoke `public` execute permissions for your new functions and then grant execute permission only to the `dev_test` user group, run the following commands. 

```
alter default privileges revoke execute on functions from public;
alter default privileges grant execute on functions to group dev_test;
```
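
Default EXECUTE privileges on stored procedures can be managed the same way. As a sketch, the following grants execute permission on future stored procedures to the same `dev_test` user group:

```
alter default privileges grant execute on procedures to group dev_test;
```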

# ALTER EXTERNAL SCHEMA
<a name="r_ALTER_EXTERNAL_SCHEMA"></a>

Alters an existing external schema in the current database. Only schema owners, superusers, or users with ALTER privileges on the schema can alter it. Only external schemas created from DATA CATALOG, KAFKA, or MSK can be altered.

The owner of this schema is the issuer of the CREATE EXTERNAL SCHEMA command. To transfer ownership of an external schema, use ALTER SCHEMA to change the owner. To grant access to the schema to other users or user groups, use the GRANT command.

You can't use the GRANT or REVOKE commands for permissions on an external table. Instead, grant or revoke the permissions on the external schema. 

For more information, see the following:
+ [ALTER SCHEMA](r_ALTER_SCHEMA.md)
+ [GRANT](r_GRANT.md)
+ [REVOKE](r_REVOKE.md)
+ [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md)
+ [Enabling mTLS authentication for an existing external schema](materialized-view-streaming-ingestion-mtls.md#materialized-view-streaming-ingestion-mtls-alter)

To view details for external schemas, query the SVV\_EXTERNAL\_SCHEMAS system view. For more information, see [SVV\_EXTERNAL\_SCHEMAS](r_SVV_EXTERNAL_SCHEMAS.md).

## Syntax
<a name="r_ALTER_EXTERNAL_SCHEMA-synopsis"></a>

```
ALTER EXTERNAL SCHEMA schema_name
[ IAM_ROLE [ default | 'SESSION' | 'arn:aws:iam::<account-id>:role/<role-name>' ] ]
[ AUTHENTICATION [ none | iam | mtls] ]
[ AUTHENTICATION_ARN 'acm-certificate-arn' | SECRET_ARN 'asm-secret-arn' ]
[ URI 'Kafka bootstrap URL' ]
```

If you have an existing external schema that you use for streaming ingestion and you want to implement mutual TLS (mTLS) for authentication, you can run a command such as the following, which specifies mTLS authentication and the ARN of the certificate in ACM. 

```
ALTER EXTERNAL SCHEMA schema_name 
AUTHENTICATION mtls
AUTHENTICATION_ARN 'arn:aws:acm:us-east-1:444455556666:certificate/certificate_ID';
```

Or you can modify mTLS authentication, with reference to the secret ARN in Secrets Manager.

```
ALTER EXTERNAL SCHEMA schema_name 
AUTHENTICATION mtls
SECRET_ARN 'arn:aws:secretsmanager:us-east-1:012345678910:secret:myMTLSSecret';
```

The following example shows how to modify the URI for ALTER EXTERNAL SCHEMA:

```
ALTER EXTERNAL SCHEMA schema_name  
URI 'lkc-ghidef-67890.centralus.azure.glb.confluent.cloud:9092';
```

The following example shows how to modify the IAM role for ALTER EXTERNAL SCHEMA:

```
ALTER EXTERNAL SCHEMA schema_name  
IAM_ROLE 'arn:aws:iam::012345678901:role/testrole';
```

## Parameters
<a name="r_ALTER_EXTERNAL_SCHEMA-parameters"></a>

 IAM_ROLE [ default | 'SESSION' | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ]   
Use the `default` keyword to have Amazon Redshift use the IAM role that is set as default.  
Use `'SESSION'` if you connect to your Amazon Redshift cluster using a federated identity and access the tables from the external schema created using this command.  
See [CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html) for more information.

AUTHENTICATION  
The authentication type defined for streaming ingestion. Streaming ingestion with authentication types works with Apache Kafka, Confluent Cloud, and Amazon Managed Streaming for Apache Kafka. See [ CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html) for more information.

AUTHENTICATION_ARN  
The ARN of the AWS Certificate Manager certificate used by Amazon Redshift for mTLS authentication with Apache Kafka, Confluent Cloud, or Amazon Managed Streaming for Apache Kafka (Amazon MSK). The ARN is available in the ACM console when you choose the issued certificate.

SECRET_ARN  
The Amazon Resource Name (ARN) of a supported secret created using AWS Secrets Manager. For information about how to create and retrieve an ARN for a secret, see [Manage secrets with AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/manage_create-basic-secret.html) in the *AWS Secrets Manager User Guide*, and [Retrieving the Amazon Resource Name (ARN) of the secret in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-secrets-manager-integration-retrieving-secret.html).

URI  
The bootstrap URL of the Apache Kafka, Confluent Cloud, or Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster. The endpoint must be reachable (routable) from the Amazon Redshift cluster. See [CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html) for more information.
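The parameters above can also be combined in one statement. The following sketch switches a hypothetical streaming schema from mTLS to IAM authentication; the schema name and role ARN are placeholders.

```
ALTER EXTERNAL SCHEMA my_streaming_schema
IAM_ROLE 'arn:aws:iam::111122223333:role/my-msk-access-role'
AUTHENTICATION iam;
```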

# ALTER EXTERNAL VIEW
<a name="r_ALTER_EXTERNAL_VIEW"></a>

Use the ALTER EXTERNAL VIEW command to update your external view. Depending on which parameters you use, other SQL engines such as Amazon Athena and Amazon EMR Spark that can also reference this view might be affected. For more information about Data Catalog views, see [AWS Glue Data Catalog views](https://docs.aws.amazon.com/redshift/latest/dg/data-catalog-views-overview.html).

## Syntax
<a name="r_ALTER_EXTERNAL_VIEW-synopsis"></a>

```
ALTER EXTERNAL VIEW
{ catalog_name.schema_name.view_name | awsdatacatalog.dbname.view_name | external_schema_name.view_name }
[ FORCE ] { AS (query_definition) | REMOVE DEFINITION }
```

## Parameters
<a name="r_ALTER_EXTERNAL_VIEW-parameters"></a>

 *schema_name.view_name*   
The schema that’s attached to your AWS Glue database, followed by the name of the view.

*catalog_name.schema_name.view_name* | *awsdatacatalog.dbname.view_name* | *external_schema_name.view_name*  
The notation of the schema to use when altering the view. You can specify the AWS Glue Data Catalog, an AWS Glue database that you created, or an external schema that you created. See [CREATE DATABASE](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_DATABASE.html) and [CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html) for more information.

FORCE  
Whether AWS Lake Formation should update the definition of the view even if the objects referenced in the table are inconsistent with other SQL engines. If Lake Formation updates the view, the view is considered stale for the other SQL engines until those engines are updated as well.

 AS (*query_definition*)   
The definition of the SQL query that Amazon Redshift runs to alter the view.

REMOVE DEFINITION  
Whether to drop and recreate the views. Views must be dropped and recreated to mark them as `PROTECTED`.

## Examples
<a name="r_ALTER_EXTERNAL_VIEW-examples"></a>

The following example alters a Data Catalog view named `sample_schema.glue_data_catalog_view`.

```
ALTER EXTERNAL VIEW sample_schema.glue_data_catalog_view
FORCE
REMOVE DEFINITION;
```
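The AS form replaces the view's query definition instead of removing it. The following sketch uses hypothetical table and column names.

```
ALTER EXTERNAL VIEW sample_schema.glue_data_catalog_view
FORCE
AS (SELECT order_id, order_total FROM sample_schema.orders WHERE order_total > 0);
```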

# ALTER FUNCTION
<a name="r_ALTER_FUNCTION"></a>

Renames a function or changes the owner. Both the function name and data types are required. Only the owner or a superuser can rename a function. Only a superuser can change the owner of a function. 

## Syntax
<a name="r_ALTER_FUNCTION-synopsis"></a>

```
ALTER FUNCTION function_name ( [ { py_arg_name py_arg_data_type | sql_arg_data_type } [ , ... ] ] )
     RENAME TO new_name
```

```
ALTER FUNCTION function_name ( [ { py_arg_name py_arg_data_type | sql_arg_data_type } [ , ... ] ] )
     OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
```

## Parameters
<a name="r_ALTER_FUNCTION-parameters"></a>

 *function_name*   
The name of the function to be altered. Either specify the name of the function in the current search path, or use the format `schema_name.function_name` to use a specific schema.

*py_arg_name py_arg_data_type* | *sql_arg_data_type*   
Optional. A list of input argument names and data types for a Python user-defined function, or a list of input argument data types for a SQL user-defined function.

 *new_name*   
A new name for the user-defined function. 

*new_owner* | CURRENT_USER | SESSION_USER  
A new owner for the user-defined function. 

## Examples
<a name="r_ALTER_FUNCTION-examples"></a>

The following example changes the name of a function from `first_quarter_revenue` to `quarterly_revenue`.

```
ALTER FUNCTION first_quarter_revenue(bigint, numeric) 
         RENAME TO quarterly_revenue;
```

The following example changes the owner of the `quarterly_revenue` function to `etl_user`.

```
ALTER FUNCTION quarterly_revenue(bigint, numeric) OWNER TO etl_user;
```
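Because *function_name* can be schema-qualified, you can also rename a function in a specific schema. The `sales` schema in this sketch is hypothetical.

```
ALTER FUNCTION sales.quarterly_revenue(bigint, numeric)
RENAME TO quarterly_revenue_v2;
```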

# ALTER GROUP
<a name="r_ALTER_GROUP"></a>

Changes a user group. Use this command to add users to the group, drop users from the group, or rename the group. 

## Syntax
<a name="r_ALTER_GROUP-synopsis"></a>

```
ALTER GROUP group_name
{
ADD USER username [, ... ] |
DROP USER username [, ... ] |
RENAME TO new_name
}
```

## Parameters
<a name="r_ALTER_GROUP-parameters"></a>

 *group_name*   
Name of the user group to modify. 

ADD   
Adds a user to a user group. 

DROP   
Removes a user from a user group. 

 *username*   
Name of the user to add to the group or drop from the group. 

RENAME TO   
Renames the user group. Group names beginning with two underscores are reserved for Amazon Redshift internal use. For more information about valid names, see [Names and identifiers](r_names.md). 

 *new_name*   
New name of the user group. 

## Examples
<a name="r_ALTER_GROUP-examples"></a>

The following example adds a user named DWUSER to the ADMIN_GROUP group.

```
ALTER GROUP admin_group
ADD USER dwuser;
```

The following example renames the group ADMIN_GROUP to ADMINISTRATORS. 

```
ALTER GROUP admin_group
RENAME TO administrators;
```

The following example adds two users to the group ADMIN_GROUP. 

```
ALTER GROUP admin_group
ADD USER u1, u2;
```

The following example drops two users from the group ADMIN_GROUP. 

```
ALTER GROUP admin_group
DROP USER u1, u2;
```

# ALTER IDENTITY PROVIDER
<a name="r_ALTER_IDENTITY_PROVIDER"></a>

Alters an identity provider to assign new parameters and values. When you run this command, all previously set parameter values are deleted before the new values are assigned. Only a superuser can alter an identity provider.

## Syntax
<a name="r_ALTER_IDENTITY_PROVIDER-synopsis"></a>

```
ALTER IDENTITY PROVIDER identity_provider_name
[PARAMETERS parameter_string]
[NAMESPACE namespace]
[IAM_ROLE iam_role]
[AUTO_CREATE_ROLES
    [ TRUE [ { INCLUDE | EXCLUDE } GROUPS LIKE filter_pattern ] |
      FALSE
    ]
]
[DISABLE | ENABLE]
```

## Parameters
<a name="r_ALTER_IDENTITY_PROVIDER-parameters"></a>

 *identity_provider_name*   
The name of the identity provider to alter. For more information about valid names, see [Names and identifiers](r_names.md).

 *parameter_string*   
A string containing a properly formatted JSON object that contains parameters and values required for the specific identity provider.

 *namespace*   
The organization namespace.

 *iam_role*   
The IAM role that provides permissions for the connection to IAM Identity Center. This parameter is applicable only when the identity-provider type is AWSIDC.

 *auto_create_roles*   
Enables or disables the auto-create role feature. If the value is TRUE, Amazon Redshift enables the auto-create role feature. If the value is FALSE, Amazon Redshift disables the auto-create role feature. If the value for this parameter isn't specified, Amazon Redshift determines the value using the following logic:   
+  If `AUTO_CREATE_ROLES` is provided but the value isn't specified, the value is set to TRUE. 
+  If `AUTO_CREATE_ROLES` isn't provided and the identity provider is AWSIDC, the value is set to FALSE. 
+  If `AUTO_CREATE_ROLES` isn't provided and the identity provider is Azure, the value is set to TRUE. 
To include groups, specify `INCLUDE`. The default is empty, which means include all groups if `AUTO_CREATE_ROLES` is on.  
To exclude groups, specify `EXCLUDE`. The default is empty, which means do not exclude any groups if `AUTO_CREATE_ROLES` is on.

 *filter_pattern*   
A valid UTF-8 character expression with a pattern to match group names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_IDENTITY_PROVIDER.html)
If *filter_pattern* doesn't contain metacharacters, then the pattern represents only the string itself; in that case LIKE acts the same as the equals operator.   
*filter_pattern* supports the following characters:  
+  Uppercase and lowercase alphabetic characters (A-Z and a-z) 
+  Numerals (0-9) 
+  The following special characters: 

  ```
  _ % ^ * + ? { } , $
  ```

DISABLE | ENABLE  
Turns an identity provider on or off. The default is ENABLE.

## Examples
<a name="r_ALTER_IDENTITY_PROVIDER-examples"></a>

The following example alters an identity provider named *oauth_standard*. This applies when Microsoft Azure AD is the identity provider.

```
ALTER IDENTITY PROVIDER oauth_standard
PARAMETERS '{"issuer":"https://sts.windows.net/2sdfdsf-d475-420d-b5ac-667adad7c702/",
"client_id":"87f4aa26-78b7-410e-bf29-57b39929ef9a",
"client_secret":"BUAH~ewrqewrqwerUUY^%tHe1oNZShoiU7",
"audience":["https://analysis.windows.net/powerbi/connector/AmazonRedshift"]
}'
```

The following sample shows how to set the identity-provider namespace. This can apply to Microsoft Azure AD, if it follows a statement like the previous sample, or to another identity provider. It can also apply to a case where you connect an existing Amazon Redshift provisioned cluster or Amazon Redshift Serverless workgroup to IAM Identity Center, if you have a connection set up through a managed application.

```
ALTER IDENTITY PROVIDER "my-redshift-idc-application"
NAMESPACE 'MYCO';
```

The following sample sets the IAM role and works in the use case for configuring Redshift integration with IAM Identity Center.

```
ALTER IDENTITY PROVIDER "my-redshift-idc-application"
IAM_ROLE 'arn:aws:iam::123456789012:role/myadministratorrole';
```

For more information about setting up a connection to IAM Identity Center from Redshift, see [Connect Redshift with IAM Identity Center to give users a single sign-on experience](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-idp-connect.html).

**Disabling an identity provider**

The following sample statement shows how to disable an identity provider. While it's disabled, federated users from the identity provider can't log in to the cluster until it's enabled again.

```
ALTER IDENTITY PROVIDER "redshift-idc-app" DISABLE;
```
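The following sketch combines AUTO_CREATE_ROLES with a group filter, limiting automatic role creation to groups whose names begin with `data_`; the application name and pattern are hypothetical.

```
ALTER IDENTITY PROVIDER "redshift-idc-app"
AUTO_CREATE_ROLES TRUE INCLUDE GROUPS LIKE 'data_%';
```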

# ALTER MASKING POLICY
<a name="r_ALTER_MASKING_POLICY"></a>

Alters an existing dynamic data masking policy. For more information on dynamic data masking, see [Dynamic data masking](t_ddm.md).

Superusers and users or roles that have the sys:secadmin role can alter a masking policy.

## Syntax
<a name="r_ALTER_MASKING_POLICY-synopsis"></a>

```
ALTER MASKING POLICY
{ policy_name | database_name.policy_name }
USING (masking_expression);
```

## Parameters
<a name="r_ALTER_MASKING_POLICY-parameters"></a>

 *policy_name*   
 The name of the masking policy. This must be the name of a masking policy that already exists in the database. 

*database_name*  
The name of the database where the policy is created. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

*masking_expression*  
The SQL expression used to transform the target columns. It can be written using data manipulation functions such as string manipulation functions, or in conjunction with user-defined functions written in SQL, Python, or with AWS Lambda.   
 The expression must match the original expression's input columns and data types. For example, if the original masking policy's input columns were `sample_1 FLOAT` and `sample_2 VARCHAR(10)`, you wouldn't be able to alter the masking policy to take a third column, or make the policy take a FLOAT and a BOOLEAN. If you use a constant as your masking expression, you must explicitly cast it to a type that matches the input type.  
 You must have the USAGE permission on any user-defined functions that you use in the masking expression. 

For the usage of ALTER MASKING POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).
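As an illustration, the following sketch alters a hypothetical masking policy while keeping the same input column name and data type, as required; the constant in the original policy is explicitly cast to match the input type.

```
-- Hypothetical policy that fully masks a credit card number.
CREATE MASKING POLICY mask_credit_card
WITH (credit_card VARCHAR(256))
USING ('XXXXXXXXXXXXXXXX'::VARCHAR(256));

-- Alter the policy to reveal only the last four digits.
-- The input column (credit_card VARCHAR(256)) is unchanged.
ALTER MASKING POLICY mask_credit_card
USING ('XXXXXXXXXXXX' || SUBSTRING(credit_card, 13, 4));
```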

# ALTER MATERIALIZED VIEW
<a name="r_ALTER_MATERIALIZED_VIEW"></a>

Changes the attributes of a materialized view. 

## Syntax
<a name="r_ALTER_MATERIALIZED_VIEW-synopsis"></a>

```
ALTER MATERIALIZED VIEW mv_name
{
AUTO REFRESH { YES | NO } 
| ALTER DISTKEY column_name
| ALTER DISTSTYLE ALL
| ALTER DISTSTYLE EVEN
| ALTER DISTSTYLE KEY DISTKEY column_name
| ALTER DISTSTYLE AUTO
| ALTER [COMPOUND] SORTKEY ( column_name [,...] )
| ALTER SORTKEY AUTO
| ALTER SORTKEY NONE
| ROW LEVEL SECURITY { ON | OFF } [ CONJUNCTION TYPE { AND | OR } ] [FOR DATASHARES]
};
```

## Parameters
<a name="r_ALTER_MATERIALIZED_VIEW-parameters"></a>

*mv_name*  
The name of the materialized view to alter.

AUTO REFRESH { YES | NO }  
A clause that turns on or off automatic refreshing of a materialized view. For more information about automatic refresh of materialized views, see [Refreshing a materialized view](materialized-view-refresh.md).

ALTER DISTSTYLE ALL  
A clause that changes the existing distribution style of a relation to `ALL`. Consider the following:  
+ ALTER DISTSTYLE, ALTER SORTKEY, and VACUUM can't run concurrently on the same relation. 
  + If VACUUM is currently running, then running ALTER DISTSTYLE ALL returns an error. 
  + If ALTER DISTSTYLE ALL is running, then a background vacuum doesn't start on a relation. 
+ The ALTER DISTSTYLE ALL command is not supported for relations with interleaved sort keys and temporary tables.
+ If the distribution style was previously defined as AUTO, then the relation is no longer a candidate for automatic table optimization. 
For more information about DISTSTYLE ALL, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).

ALTER DISTSTYLE EVEN  
A clause that changes the existing distribution style of a relation to `EVEN`. Consider the following:  
+ ALTER DISTSTYLE, ALTER SORTKEY, and VACUUM can't run concurrently on the same relation. 
  + If VACUUM is currently running, then running ALTER DISTSTYLE EVEN returns an error. 
  + If ALTER DISTSTYLE EVEN is running, then a background vacuum doesn't start on a relation. 
+ The ALTER DISTSTYLE EVEN command is not supported for relations with interleaved sort keys and temporary tables.
+ If the distribution style was previously defined as AUTO, then the relation is no longer a candidate for automatic table optimization. 
For more information about DISTSTYLE EVEN, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).

ALTER DISTKEY *column_name* or ALTER DISTSTYLE KEY DISTKEY *column_name*  
A clause that changes the column used as the distribution key of a relation. Consider the following:  
+ VACUUM and ALTER DISTKEY can't run concurrently on the same relation. 
  + If VACUUM is already running, then ALTER DISTKEY returns an error.
  + If ALTER DISTKEY is running, then background vacuum doesn't start on a relation.
  + If ALTER DISTKEY is running, then foreground vacuum returns an error.
+ You can only run one ALTER DISTKEY command on a relation at a time. 
+ The ALTER DISTKEY command is not supported for relations with interleaved sort keys. 
+ If the distribution style was previously defined as AUTO, then the relation is no longer a candidate for automatic table optimization. 
When specifying DISTSTYLE KEY, the data is distributed by the values in the DISTKEY column. For more information about DISTSTYLE, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).

ALTER DISTSTYLE AUTO  
A clause that changes the existing distribution style of a relation to AUTO.   
When you alter a distribution style to AUTO, the distribution style of the relation is set to the following:   
+ A small relation with DISTSTYLE ALL is converted to AUTO(ALL). 
+ A small relation with DISTSTYLE EVEN is converted to AUTO(ALL). 
+ A small relation with DISTSTYLE KEY is converted to AUTO(ALL). 
+ A large relation with DISTSTYLE ALL is converted to AUTO(EVEN). 
+ A large relation with DISTSTYLE EVEN is converted to AUTO(EVEN). 
+ A large relation with DISTSTYLE KEY is converted to AUTO(KEY) and the DISTKEY is preserved. In this case, Amazon Redshift makes no changes to the relation.
If Amazon Redshift determines that a new distribution style or key will improve the performance of queries, then Amazon Redshift might change the distribution style or key of your relation in the future. For example, Amazon Redshift might convert a relation with a DISTSTYLE of AUTO(KEY) to AUTO(EVEN), or vice versa. For more information about behavior when distribution keys are altered, including data redistribution and locks, go to [Amazon Redshift Advisor recommendations](https://docs.aws.amazon.com/redshift/latest/dg/advisor-recommendations.html#alter-diststyle-distkey-recommendation).  
For more information about DISTSTYLE AUTO, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).   
To view the distribution style of a relation, query the SVV_TABLE_INFO system catalog view. For more information, go to [SVV_TABLE_INFO](r_SVV_TABLE_INFO.md). To view the Amazon Redshift Advisor recommendations for relations, query the SVV_ALTER_TABLE_RECOMMENDATIONS system catalog view. For more information, go to [SVV_ALTER_TABLE_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). To view the actions taken by Amazon Redshift, query the SVL_AUTO_WORKER_ACTION system catalog view. For more information, go to [SVL_AUTO_WORKER_ACTION](r_SVL_AUTO_WORKER_ACTION.md). 

ALTER [COMPOUND] SORTKEY ( *column_name* [,...] )  
A clause that changes or adds the sort key used for a relation. ALTER SORTKEY isn't supported for temporary tables.  
When you alter a sort key, the compression encoding of columns in the new or original sort key can change. If no encoding is explicitly defined for the relation, then Amazon Redshift automatically assigns compression encodings as follows:  
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR or VARCHAR are assigned LZO compression.
Consider the following:  
+ You can define a maximum of 400 columns for a sort key per relation. 
+ You can alter an interleaved sort key to a compound sort key or no sort key. However, you can't alter a compound sort key to an interleaved sort key. 
+ If the sort key was previously defined as AUTO, then the relation is no longer a candidate for automatic table optimization. 
+ Amazon Redshift recommends using RAW encoding (no compression) for columns defined as sort keys. When you alter a column to choose it as a sort key, the column's compression is changed to RAW compression (no compression). This can increase the amount of storage required by the relation. How much the relation size increases depends on the specific relation definition and relation contents. For more information about compression, go to [Compression encodings](c_Compression_encodings.md). 
When data is loaded into a relation, the data is loaded in the order of the sort key. When you alter the sort key, Amazon Redshift reorders the data. For more information about SORTKEY, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).

ALTER SORTKEY AUTO  
A clause that changes or adds the sort key of the target relation to AUTO. ALTER SORTKEY AUTO isn't supported for temporary tables.   
When you alter a sort key to AUTO, Amazon Redshift preserves the existing sort key of the relation.   
If Amazon Redshift determines that a new sort key will improve the performance of queries, then Amazon Redshift might change the sort key of your relation in the future.   
For more information about SORTKEY AUTO, go to [CREATE MATERIALIZED VIEW](materialized-view-create-sql-command.md).   
To view the sort key of a relation, query the SVV_TABLE_INFO system catalog view. For more information, go to [SVV_TABLE_INFO](r_SVV_TABLE_INFO.md). To view the Amazon Redshift Advisor recommendations for relations, query the SVV_ALTER_TABLE_RECOMMENDATIONS system catalog view. For more information, go to [SVV_ALTER_TABLE_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). To view the actions taken by Amazon Redshift, query the SVL_AUTO_WORKER_ACTION system catalog view. For more information, go to [SVL_AUTO_WORKER_ACTION](r_SVL_AUTO_WORKER_ACTION.md). 

ALTER SORTKEY NONE  
A clause that removes the sort key of the target relation.   
If the sort key was previously defined as AUTO, then the relation is no longer a candidate for automatic table optimization. 

ROW LEVEL SECURITY { ON | OFF } [ CONJUNCTION TYPE { AND | OR } ] [ FOR DATASHARES ]  
A clause that turns on or off row-level security for a relation.  
When row-level security is turned on for a relation, you can only read the rows that the row-level security policy permits you to access. When there isn't any policy granting you access to the relation, you can't see any rows from the relation. Only superusers and users or roles that have the `sys:secadmin` role can set the ROW LEVEL SECURITY clause. For more information, see [Row-level security](t_rls.md).  
+ [ CONJUNCTION TYPE { AND | OR } ] 

  A clause that allows you to choose the conjunction type of row-level security policy for a relation. When multiple row-level security policies are attached to a relation, you can combine the policies with the AND or OR clause. By default, Amazon Redshift combines RLS policies with the AND clause. Superusers, users, or roles that have the `sys:secadmin` role can use this clause to define the conjunction type of row-level security policy for a relation. For more information, see [Combining multiple policies per user](t_rls_combine_policies.md). 
+ FOR DATASHARES

   A clause that determines whether an RLS-protected relation can be accessed over datashares. By default, an RLS-protected relation can’t be accessed over a datashare. An ALTER MATERIALIZED VIEW ROW LEVEL SECURITY command run with this clause only affects the relation’s datashare accessibility property. The ROW LEVEL SECURITY property isn’t changed.

   If you make an RLS-protected relation accessible over datashares, the relation doesn’t have row-level security in the consumer-side datashared database. The relation retains its RLS property on the producer side. 

## Examples
<a name="r_ALTER_MATERIALIZED_VIEW-examples"></a>

The following example enables the `tickets_mv` materialized view to be automatically refreshed.

```
ALTER MATERIALIZED VIEW tickets_mv AUTO REFRESH YES;
```
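The following sketch turns on row-level security for the same materialized view and combines any attached RLS policies with OR; the effect depends on the policies attached in your environment.

```
ALTER MATERIALIZED VIEW tickets_mv ROW LEVEL SECURITY ON CONJUNCTION TYPE OR;
```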

# DISTSTYLE and SORTKEY examples for ALTER MATERIALIZED VIEW
<a name="r_ALTER_MATERIALIZED_VIEW-DISTSTYLE-SORTKEY-examples"></a>

The examples in this topic show you how to perform DISTSTYLE and SORTKEY changes, using ALTER MATERIALIZED VIEW.

The following example queries show how to alter a DISTSTYLE KEY DISTKEY column using a sample base table:

```
CREATE TABLE base_inventory(
  inv_date_sk int4 NOT NULL,
  inv_item_sk int4 NOT NULL,
  inv_warehouse_sk int4 NOT NULL,
  inv_quantity_on_hand int4
);

INSERT INTO base_inventory VALUES(1,1,1,1);

CREATE MATERIALIZED VIEW inventory DISTSTYLE EVEN AS SELECT * FROM base_inventory;
SELECT "table", diststyle FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER DISTSTYLE KEY DISTKEY inv_warehouse_sk;
SELECT "table", diststyle FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER DISTKEY inv_item_sk;
SELECT "table", diststyle FROM svv_table_info WHERE "table" = 'inventory';

DROP TABLE base_inventory CASCADE;
```

Alter a materialized view to DISTSTYLE ALL:

```
CREATE TABLE base_inventory(
  inv_date_sk int4 NOT NULL,
  inv_item_sk int4 NOT NULL,
  inv_warehouse_sk int4 NOT NULL,
  inv_quantity_on_hand int4
);

INSERT INTO base_inventory VALUES(1,1,1,1);

CREATE MATERIALIZED VIEW inventory DISTSTYLE EVEN AS SELECT * FROM base_inventory;
SELECT "table", diststyle FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER DISTSTYLE ALL;
SELECT "table", diststyle FROM svv_table_info WHERE "table" = 'inventory';

DROP TABLE base_inventory CASCADE;
```

The following commands show ALTER MATERIALIZED VIEW SORTKEY examples using a sample base table:

```
CREATE TABLE base_inventory (c0 int, c1 int);

INSERT INTO base_inventory VALUES(1,1);

CREATE MATERIALIZED VIEW inventory INTERLEAVED SORTKEY(c0, c1) AS SELECT * FROM base_inventory;
SELECT "table", sortkey1 FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER SORTKEY(c0, c1);
SELECT "table", diststyle, sortkey_num FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER SORTKEY NONE;
SELECT "table", diststyle, sortkey_num FROM svv_table_info WHERE "table" = 'inventory';

ALTER MATERIALIZED VIEW inventory ALTER SORTKEY(c0);
SELECT "table", diststyle, sortkey_num FROM svv_table_info WHERE "table" = 'inventory';

DROP TABLE base_inventory CASCADE;
```

# ALTER RLS POLICY
<a name="r_ALTER_RLS_POLICY"></a>

Alter an existing row-level security policy on a table.

Superusers and users or roles that have the `sys:secadmin` role can alter a policy.

## Syntax
<a name="r_ALTER_RLS_POLICY-synopsis"></a>

```
ALTER RLS POLICY
{ policy_name | database_name.policy_name }
USING ( using_predicate_exp );
```

## Parameters
<a name="r_ALTER_RLS_POLICY-parameters"></a>

 *policy_name*   
The name of the policy.

*database_name*  
The name of the database where the policy is created. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

USING ( *using_predicate_exp* )  
Specifies a filter that is applied to the WHERE clause of the query. Amazon Redshift applies a policy predicate before the query-level user predicates. For example, **current_user = 'joe' and price > 10** limits Joe to seeing only records with a price greater than $10.  
The expression has access to the variables declared in the WITH clause of the CREATE RLS POLICY statement that was used to create the policy named *policy_name*.

For the usage of ALTER RLS POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).

## Examples
<a name="r_ALTER_RLS_POLICY-examples"></a>

The following example alters an RLS policy.

```
-- First create an RLS policy that limits access to rows where catgroup is 'concerts'.
CREATE RLS POLICY policy_concerts
WITH (catgroup VARCHAR(10))
USING (catgroup = 'concerts');

-- Then, alter the RLS policy to only show rows where catgroup is 'piano concerts'.
ALTER RLS POLICY policy_concerts
USING (catgroup = 'piano concerts');
```

# ALTER ROLE
<a name="r_ALTER_ROLE"></a>

Renames a role or changes the owner. For a list of Amazon Redshift system-defined roles, see [Amazon Redshift system-defined roles](r_roles-default.md).

## Required permissions
<a name="r_ALTER_ROLE-privileges"></a>

Following are the required permissions for ALTER ROLE:
+ Superuser
+ Users with the ALTER ROLE permissions

## Syntax
<a name="r_ALTER_ROLE-synopsis"></a>

```
ALTER ROLE role [ WITH ]
  { { RENAME TO role } | { OWNER TO user_name } }[, ...]
  [ EXTERNALID TO external_id ]
```

## Parameters
<a name="r_ALTER_ROLE-parameters"></a>

 *role*   
The name of the role to be altered.

RENAME TO  
A new name for the role.

OWNER TO *user_name*  
A new owner for the role. 

EXTERNALID TO *external_id*  
A new external ID for the role, which is associated with an identity provider. For more information, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html).

## Examples
<a name="r_ALTER_ROLE-examples"></a>

The following example changes the name of a role from `sample_role1` to `sample_role2`.

```
ALTER ROLE sample_role1 RENAME TO sample_role2;
```

The following example changes the owner of the role.

```
ALTER ROLE sample_role1 WITH OWNER TO user1;
```


The following example updates a role `sample_role1` with a new external ID that is associated with an identity provider.

```
ALTER ROLE sample_role1 EXTERNALID TO "XYZ456";
```

# ALTER PROCEDURE
<a name="r_ALTER_PROCEDURE"></a>

Renames a procedure or changes the owner. Both the procedure name and data types, or signature, are required. Only the owner or a superuser can rename a procedure. Only a superuser can change the owner of a procedure. 

## Syntax
<a name="r_ALTER_PROCEDURE-synopsis"></a>

```
ALTER PROCEDURE sp_name [ ( [ [ argname ] [ argmode ] argtype [, ...] ] ) ]
    RENAME TO new_name
```

```
ALTER PROCEDURE sp_name [ ( [ [ argname ] [ argmode ] argtype [, ...] ] ) ]
    OWNER TO { new_owner | CURRENT_USER | SESSION_USER }
```

## Parameters
<a name="r_ALTER_PROCEDURE-parameters"></a>

 *sp_name*   
The name of the procedure to be altered. Either specify just the name of the procedure in the current search path, or use the format `schema_name.sp_procedure_name` to use a specific schema.

*[argname] [argmode] argtype*   
A list of argument names, argument modes, and data types. Only the input data types are required, which are used to identify the stored procedure. Alternatively, you can provide the full signature used to create the procedure including the input and output parameters with their modes.

 *new_name*   
A new name for the stored procedure. 

*new_owner* | CURRENT_USER | SESSION_USER  
A new owner for the stored procedure. 

## Examples
<a name="r_ALTER_PROCEDURE-examples"></a>

The following example changes the name of a procedure from `first_quarter_revenue` to `quarterly_revenue`.

```
ALTER PROCEDURE first_quarter_revenue(volume INOUT bigint, at_price IN numeric,
 result OUT int) RENAME TO quarterly_revenue;
```

This example is equivalent to the following.

```
ALTER PROCEDURE first_quarter_revenue(bigint, numeric) RENAME TO quarterly_revenue;
```

The following example changes the owner of a procedure to `etl_user`.

```
ALTER PROCEDURE quarterly_revenue(bigint, numeric) OWNER TO etl_user;
```

# ALTER SCHEMA
<a name="r_ALTER_SCHEMA"></a>

Changes the definition of an existing schema. Use this command to rename a schema or change the owner of a schema. For example, rename an existing schema to preserve a backup copy of that schema when you plan to create a new version of that schema. For more information about schemas, see [CREATE SCHEMA](r_CREATE_SCHEMA.md).

To view the configured schema quotas, see [SVV_SCHEMA_QUOTA_STATE](r_SVV_SCHEMA_QUOTA_STATE.md).

To view the records where schema quotas were exceeded, see [STL_SCHEMA_QUOTA_VIOLATIONS](r_STL_SCHEMA_QUOTA_VIOLATIONS.md).

## Required privileges
<a name="r_ALTER_SCHEMA-privileges"></a>

Following are the required privileges for ALTER SCHEMA:
+ Superuser
+ User with the ALTER SCHEMA privilege
+ Schema owner

When you change a schema name, note that objects using the old name, such as stored procedures or materialized views, must be updated to use the new name.

## Syntax
<a name="r_ALTER_SCHEMA-synopsis"></a>

```
ALTER SCHEMA schema_name
{
RENAME TO new_name |
OWNER TO new_owner |
QUOTA { quota [MB | GB | TB] | UNLIMITED }
}
```

## Parameters
<a name="r_ALTER_SCHEMA-parameters"></a>

 *schema_name*   
The name of the database schema to be altered. 

RENAME TO   
A clause that renames the schema. 

 *new_name*   
The new name of the schema. For more information about valid names, see [Names and identifiers](r_names.md). 

OWNER TO   
A clause that changes the owner of the schema. 

 *new_owner*   
The new owner of the schema. 

QUOTA   
The maximum amount of disk space that the specified schema can use. This space is the collective size of all tables under the specified schema. Amazon Redshift converts the selected value to megabytes. Gigabytes is the default unit of measurement when you don't specify one.  
For more information about configuring schema quotas, see [CREATE SCHEMA](r_CREATE_SCHEMA.md).
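
For example, assuming the `sales` schema exists, the following two statements are equivalent, because gigabytes is the default unit:

```
-- Hypothetical schema name; both statements set a 300 GB quota.
ALTER SCHEMA sales QUOTA 300 GB;
ALTER SCHEMA sales QUOTA 300;
```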

## Examples
<a name="r_ALTER_SCHEMA-examples"></a>

The following example renames the SALES schema to US_SALES.

```
alter schema sales
rename to us_sales;
```

The following example gives ownership of the US_SALES schema to the user DWUSER.

```
alter schema us_sales
owner to dwuser;
```

The following example changes the quota to 300 GB and then removes the quota.

```
alter schema us_sales QUOTA 300 GB;
alter schema us_sales QUOTA UNLIMITED;
```

# ALTER SYSTEM
<a name="r_ALTER_SYSTEM"></a>

Changes a system-level configuration option for the Amazon Redshift cluster or Redshift Serverless workgroup.

## Required privileges
<a name="r_ALTER_SYSTEM-privileges"></a>

One of the following user types can run the ALTER SYSTEM command:
+ Superuser
+ Admin user

## Syntax
<a name="r_ALTER_SYSTEM-synopsis"></a>

```
ALTER SYSTEM SET system-level-configuration = { true | t | on | false | f | off }
```

## Parameters
<a name="r_ALTER_SYSTEM-parameters"></a>

 *system-level-configuration*   
A system-level configuration. Valid values: `data_catalog_auto_mount` and `metadata_security`.

{ true | t | on | false | f | off }   
A value that activates or deactivates the system-level configuration. The values `true`, `t`, and `on` activate the configuration. The values `false`, `f`, and `off` deactivate it.

## Usage notes
<a name="r_ALTER_SYSTEM-usage-notes"></a>

For a provisioned cluster, changes to `data_catalog_auto_mount` take effect on the next reboot of the cluster. For more information, see [Rebooting a cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-clusters-console.html#reboot-cluster) in the *Amazon Redshift Management Guide*.

For a serverless workgroup, changes to `data_catalog_auto_mount` don't take effect immediately.

## Examples
<a name="r_ALTER_SYSTEM-examples"></a>

The following example turns on automounting the AWS Glue Data Catalog.

```
ALTER SYSTEM SET data_catalog_auto_mount = true;
```

The following example turns on metadata security.

```
ALTER SYSTEM SET metadata_security = true;
```

### Setting a default identity namespace
<a name="r_ALTER_SYSTEM-identity"></a>

This example is specific to working with an identity provider. You can integrate Redshift with IAM Identity Center and an identity provider to centralize identity management for Redshift and other AWS services.

The following sample shows how to set the default identity namespace for the system. Doing this makes subsequent GRANT and CREATE statements simpler to run, because you don't have to include the namespace as a prefix for each identity.

```
ALTER SYSTEM SET default_identity_namespace = 'MYCO';
```

After running the command, you can run statements like the following:

```
GRANT SELECT ON TABLE mytable TO alice;

GRANT UPDATE ON TABLE mytable TO salesrole;
               
CREATE USER bob password 'md50c983d1a624280812631c5389e60d48c';
```

Setting the default identity namespace means that you don't have to include it as a prefix for each identity. In this example, `alice` resolves to `MYCO:alice`. This applies to any identity included in the statement. For more information about using an identity provider with Redshift, see [Connect Redshift with IAM Identity Center to give users a single sign-on experience](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-idp-connect.html).

For more information about settings that pertain to Redshift configuration with IAM Identity Center, see [SET](r_SET.md) and [ALTER IDENTITY PROVIDER](r_ALTER_IDENTITY_PROVIDER.md).

# ALTER TABLE
<a name="r_ALTER_TABLE"></a>

This command changes the definition of an Amazon Redshift table or Amazon Redshift Spectrum external table. This command updates the values and properties set by [CREATE TABLE](r_CREATE_TABLE_NEW.md) or [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md). You can use ALTER TABLE on a view for row-level security (RLS).

You can't run ALTER TABLE on an external table within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

ALTER TABLE locks the table for read and write operations until the transaction enclosing the ALTER TABLE operation completes, unless it's specifically stated in the documentation that you can query data or perform other operations on the table while it is altered.

## Required privileges
<a name="r_ALTER_TABLE-privileges"></a>

The user that alters a table needs the proper privilege for the command to succeed. Depending on the ALTER TABLE command, one of the following privileges is required.
+ Superuser
+ Users with the ALTER TABLE privilege
+ Table owner with the USAGE privilege on the schema

## Syntax
<a name="r_ALTER_TABLE-synopsis"></a>

```
ALTER TABLE table_name
{
ADD table_constraint
| DROP CONSTRAINT constraint_name [ RESTRICT | CASCADE ]
| OWNER TO new_owner
| RENAME TO new_name
| RENAME COLUMN column_name TO new_name
| ALTER COLUMN column_name TYPE updated_varchar_data_type_size
| ALTER COLUMN column_name ENCODE new_encode_type
| ALTER COLUMN column_name ENCODE encode_type, ALTER COLUMN column_name ENCODE encode_type [, ...]
| ALTER DISTKEY column_name
| ALTER DISTSTYLE ALL
| ALTER DISTSTYLE EVEN
| ALTER DISTSTYLE KEY DISTKEY column_name
| ALTER DISTSTYLE AUTO
| ALTER [COMPOUND] SORTKEY ( column_name [,...] )
| ALTER SORTKEY AUTO
| ALTER SORTKEY NONE
| ALTER ENCODE AUTO
| ADD [ COLUMN ] column_name column_type
  [ DEFAULT default_expr ]
  [ ENCODE encoding ]
  [ NOT NULL | NULL ]
  [ COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI } ]
| DROP [ COLUMN ] column_name [ RESTRICT | CASCADE ] 
| ROW LEVEL SECURITY { ON | OFF } [ CONJUNCTION TYPE { AND | OR } ] [ FOR DATASHARES ]
| MASKING { ON | OFF } FOR DATASHARES }

where table_constraint is:

[ CONSTRAINT constraint_name ]
{ UNIQUE ( column_name [, ... ] )
| PRIMARY KEY ( column_name [, ... ] )
| FOREIGN KEY (column_name [, ... ] )
   REFERENCES  reftable [ ( refcolumn ) ]}

The following options apply only to external tables:

SET LOCATION { 's3://bucket/folder/' | 's3://bucket/manifest_file' }
| SET FILE FORMAT format |
| SET TABLE PROPERTIES ('property_name'='property_value')
| PARTITION ( partition_column=partition_value [, ...] )
  SET LOCATION { 's3://bucket/folder' |'s3://bucket/manifest_file' }
| ADD [IF NOT EXISTS]
    PARTITION ( partition_column=partition_value [, ...] ) LOCATION { 's3://bucket/folder' |'s3://bucket/manifest_file' }
    [, ... ]
| DROP PARTITION ( partition_column=partition_value [, ...] )
```

To reduce the time to run the ALTER TABLE command, you can combine some clauses of the ALTER TABLE command.

Amazon Redshift supports the following combinations of the ALTER TABLE clauses:

```
ALTER TABLE tablename ALTER SORTKEY (column_list), ALTER DISTKEY column_Id;
ALTER TABLE tablename ALTER DISTKEY column_Id, ALTER SORTKEY (column_list);
ALTER TABLE tablename ALTER SORTKEY (column_list), ALTER DISTSTYLE ALL;
ALTER TABLE tablename ALTER DISTSTYLE ALL, ALTER SORTKEY (column_list);
```

## Parameters
<a name="r_ALTER_TABLE-parameters"></a>

 *table_name*   
The name of the table to alter. Either specify just the name of the table, or use the format *schema_name.table_name* to use a specific schema. External tables must be qualified by an external schema name. You can also specify a view name if you're using the ALTER TABLE statement to rename a view or change its owner. The maximum length for the table name is 127 bytes; longer names are truncated to 127 bytes. You can use UTF-8 multibyte characters up to a maximum of four bytes. For more information about valid names, see [Names and identifiers](r_names.md).

ADD *table_constraint*   
A clause that adds the specified constraint to the table. For descriptions of valid *table_constraint* values, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).  
You can't add a primary-key constraint to a nullable column. If the column was originally created with the NOT NULL constraint, you can add the primary-key constraint.

DROP CONSTRAINT *constraint_name*   
A clause that drops the named constraint from the table. To drop a constraint, specify the constraint name, not the constraint type. To view table constraint names, run the following query.  

```
select constraint_name, constraint_type
from information_schema.table_constraints;
```
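
For example, after finding a constraint name with the preceding query, you can drop it by name. The table and constraint names here are hypothetical:

```
ALTER TABLE category DROP CONSTRAINT category_pkey;
```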

RESTRICT   
A clause that removes only the specified constraint. RESTRICT is an option for DROP CONSTRAINT. RESTRICT can't be used with CASCADE. 

CASCADE   
A clause that removes the specified constraint and anything dependent on that constraint. CASCADE is an option for DROP CONSTRAINT. CASCADE can't be used with RESTRICT.

OWNER TO *new_owner*   
A clause that changes the owner of the table (or view) to the *new_owner* value.

RENAME TO *new_name*   
A clause that renames a table (or view) to the value specified in *new_name*. The maximum table name length is 127 bytes; longer names are truncated to 127 bytes.  
You can't rename a permanent table to a name that begins with '#'. A table name beginning with '#' indicates a temporary table.  
You can't rename an external table.

ALTER COLUMN *column_name* TYPE *updated_varchar_data_type_size*   
A clause that changes the size of a column defined as a VARCHAR data type. This clause only supports altering the size of a VARCHAR data type. Consider the following limitations:  
+ You can't alter a column with compression encodings BYTEDICT, RUNLENGTH, TEXT255, or TEXT32K. 
+ You can't decrease the size to less than the maximum size of existing data. 
+ You can't alter columns that have default values. 
+ You can't alter columns that have UNIQUE, PRIMARY KEY, or FOREIGN KEY constraints. 
+ You can't alter columns within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 
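
Within these limitations, a sketch of increasing a VARCHAR column's size follows; the table and column names are hypothetical:

```
ALTER TABLE event ALTER COLUMN eventname TYPE varchar(300);
```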

ALTER COLUMN *column_name* ENCODE *new_encode_type*   
A clause that changes the compression encoding of a column. If you specify compression encoding for a column, the table is no longer set to ENCODE AUTO. For information on compression encoding, see [Column compression to reduce the size of stored data](t_Compressing_data_on_disk.md).   
When you change compression encoding for a column, the table remains available to query.  
Consider the following limitations:  
+ You can't alter a column to the same encoding as currently defined for the column. 
+ You can't alter the encoding for a column in a table with an interleaved sortkey. 

ALTER COLUMN *column_name* ENCODE *encode_type*, ALTER COLUMN *column_name* ENCODE *encode_type* [, ...]   
A clause that changes the compression encoding of multiple columns in a single command. For information on compression encoding, see [Column compression to reduce the size of stored data](t_Compressing_data_on_disk.md).  
When you change compression encoding for a column, the table remains available to query.  
 Consider the following limitations:  
+ You can't alter a column to the same or different encoding type multiple times in a single command.
+ You can't alter a column to the same encoding as currently defined for the column. 
+ You can't alter the encoding for a column in a table with an interleaved sortkey. 
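
A sketch of changing the encoding of two columns in a single statement, using hypothetical table and column names:

```
ALTER TABLE t1
ALTER COLUMN c1 ENCODE zstd,
ALTER COLUMN c2 ENCODE az64;
```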

ALTER DISTSTYLE ALL  
A clause that changes the existing distribution style of a table to `ALL`. Consider the following:  
+ ALTER DISTSTYLE, ALTER SORTKEY, and VACUUM can't run concurrently on the same table. 
  + If VACUUM is currently running, then running ALTER DISTSTYLE ALL returns an error. 
  + If ALTER DISTSTYLE ALL is running, then a background vacuum doesn't start on a table. 
+ The ALTER DISTSTYLE ALL command isn't supported for tables with interleaved sort keys or for temporary tables.
+ If the distribution style was previously defined as AUTO, then the table is no longer a candidate for automatic table optimization. 
For more information about DISTSTYLE ALL, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).

ALTER DISTSTYLE EVEN  
A clause that changes the existing distribution style of a table to `EVEN`. Consider the following:  
+ ALTER DISTSTYLE, ALTER SORTKEY, and VACUUM can't run concurrently on the same table. 
  + If VACUUM is currently running, then running ALTER DISTSTYLE EVEN returns an error. 
  + If ALTER DISTSTYLE EVEN is running, then a background vacuum doesn't start on a table. 
+ The ALTER DISTSTYLE EVEN command isn't supported for tables with interleaved sort keys or for temporary tables.
+ If the distribution style was previously defined as AUTO, then the table is no longer a candidate for automatic table optimization. 
For more information about DISTSTYLE EVEN, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).

ALTER DISTKEY *column_name* or ALTER DISTSTYLE KEY DISTKEY *column_name*  
A clause that changes the column used as the distribution key of a table. Consider the following:  
+ VACUUM and ALTER DISTKEY can't run concurrently on the same table. 
  + If VACUUM is already running, then ALTER DISTKEY returns an error.
  + If ALTER DISTKEY is running, then background vacuum doesn't start on a table.
  + If ALTER DISTKEY is running, then foreground vacuum returns an error.
+ You can only run one ALTER DISTKEY command on a table at a time. 
+ The ALTER DISTKEY command is not supported for tables with interleaved sort keys. 
+ If the distribution style was previously defined as AUTO, then the table is no longer a candidate for automatic table optimization. 
When specifying DISTSTYLE KEY, the data is distributed by the values in the DISTKEY column. For more information about DISTSTYLE, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).
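
For example, assuming a table `t1` with a column `c0` that meets the preceding conditions, either of the following forms changes the distribution key:

```
ALTER TABLE t1 ALTER DISTKEY c0;
ALTER TABLE t1 ALTER DISTSTYLE KEY DISTKEY c0;
```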

ALTER DISTSTYLE AUTO  
A clause that changes the existing distribution style of a table to AUTO.   
When you alter a distribution style to AUTO, the distribution style of the table is set to the following:   
+ A small table with DISTSTYLE ALL is converted to AUTO(ALL). 
+ A small table with DISTSTYLE EVEN is converted to AUTO(ALL). 
+ A small table with DISTSTYLE KEY is converted to AUTO(ALL). 
+ A large table with DISTSTYLE ALL is converted to AUTO(EVEN). 
+ A large table with DISTSTYLE EVEN is converted to AUTO(EVEN). 
+ A large table with DISTSTYLE KEY is converted to AUTO(KEY) and the DISTKEY is preserved. In this case, Amazon Redshift makes no changes to the table.
If Amazon Redshift determines that a new distribution style or key will improve the performance of queries, then Amazon Redshift might change the distribution style or key of your table in the future. For example, Amazon Redshift might convert a table with a DISTSTYLE of AUTO(KEY) to AUTO(EVEN), or vice versa. For more information about behavior when distribution keys are altered, including data redistribution and locks, see [Amazon Redshift Advisor recommendations](https://docs.aws.amazon.com/redshift/latest/dg/advisor-recommendations.html#alter-diststyle-distkey-recommendation).  
For more information about DISTSTYLE AUTO, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).   
To view the distribution style of a table, query the SVV_TABLE_INFO system catalog view. For more information, see [SVV_TABLE_INFO](r_SVV_TABLE_INFO.md). To view the Amazon Redshift Advisor recommendations for tables, query the SVV_ALTER_TABLE_RECOMMENDATIONS system catalog view. For more information, see [SVV_ALTER_TABLE_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). To view the actions taken by Amazon Redshift, query the SVL_AUTO_WORKER_ACTION system catalog view. For more information, see [SVL_AUTO_WORKER_ACTION](r_SVL_AUTO_WORKER_ACTION.md). 

ALTER [COMPOUND] SORTKEY ( *column_name* [,...] )  
A clause that changes or adds the sort key used for a table. ALTER SORTKEY isn't supported for temporary tables.  
When you alter a sort key, the compression encoding of columns in the new or original sort key can change. If no encoding is explicitly defined for the table, then Amazon Redshift automatically assigns compression encodings as follows:  
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR or VARCHAR are assigned LZO compression.
Consider the following:  
+ You can define a maximum of 400 columns for a sort key per table. 
+ You can alter an interleaved sort key to a compound sort key or no sort key. However, you can't alter a compound sort key to an interleaved sort key. 
+ If the sort key was previously defined as AUTO, then the table is no longer a candidate for automatic table optimization. 
+ Amazon Redshift recommends using RAW encoding (no compression) for columns defined as sort keys. When you alter a column to choose it as a sort key, the column's compression is changed to RAW compression (no compression). This can increase the amount of storage required by the table. How much the table size increases depends on the specific table definition and table contents. For more information about compression, see [Compression encodings](c_Compression_encodings.md). 
When data is loaded into a table, the data is loaded in the order of the sort key. When you alter the sort key, Amazon Redshift reorders the data. For more information about SORTKEY, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).
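
For example, a sketch of changing a table to a compound sort key on two columns, using hypothetical names:

```
ALTER TABLE t1 ALTER COMPOUND SORTKEY (c1, c2);
```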

ALTER SORTKEY AUTO  
A clause that changes or adds the sort key of the target table to AUTO. ALTER SORTKEY AUTO isn't supported for temporary tables.   
When you alter a sort key to AUTO, Amazon Redshift preserves the existing sort key of the table.   
If Amazon Redshift determines that a new sort key will improve the performance of queries, then Amazon Redshift might change the sort key of your table in the future.   
For more information about SORTKEY AUTO, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).   
To view the sort key of a table, query the SVV_TABLE_INFO system catalog view. For more information, see [SVV_TABLE_INFO](r_SVV_TABLE_INFO.md). To view the Amazon Redshift Advisor recommendations for tables, query the SVV_ALTER_TABLE_RECOMMENDATIONS system catalog view. For more information, see [SVV_ALTER_TABLE_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). To view the actions taken by Amazon Redshift, query the SVL_AUTO_WORKER_ACTION system catalog view. For more information, see [SVL_AUTO_WORKER_ACTION](r_SVL_AUTO_WORKER_ACTION.md). 

ALTER SORTKEY NONE  
A clause that removes the sort key of the target table.   
If the sort key was previously defined as AUTO, then the table is no longer a candidate for automatic table optimization. 

ALTER ENCODE AUTO  
A clause that changes the encoding type of the target table columns to AUTO. When you alter encoding to AUTO, Amazon Redshift preserves the existing encoding type of the columns in the table. Then, if Amazon Redshift determines that a new encoding type can improve query performance, Amazon Redshift can change the encoding type of the table columns.   
If you alter one or more columns to specify an encoding, Amazon Redshift no longer automatically adjusts encoding for all columns in the table. The columns retain the current encode settings.  
The following actions don't affect the ENCODE AUTO setting for the table:   
+ Renaming the table.
+ Altering the DISTSTYLE or SORTKEY setting for the table.
+ Adding or dropping a column with an ENCODE setting.
+ Using the COMPUPDATE option of the COPY command. For more information, see [Data load operations](copy-parameters-data-load.md).
To view the encoding of a table, query the SVV_TABLE_INFO system catalog view. For more information, see [SVV_TABLE_INFO](r_SVV_TABLE_INFO.md).
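
For example, to return a table with the hypothetical name `t1` to automatic encoding management:

```
ALTER TABLE t1 ALTER ENCODE AUTO;
```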

RENAME COLUMN *column_name* TO *new_name*   
A clause that renames a column to the value specified in *new_name*. The maximum column name length is 127 bytes; longer names are truncated to 127 bytes. For more information about valid names, see [Names and identifiers](r_names.md).

ADD [ COLUMN ] *column_name*   
A clause that adds a column with the specified name to the table. You can add only one column in each ALTER TABLE statement.  
You can't add a column that is the distribution key (DISTKEY) or a sort key (SORTKEY) of the table.  
 You can't use an ALTER TABLE ADD COLUMN command to modify the following table and column attributes:   
+ UNIQUE
+ PRIMARY KEY
+ REFERENCES (foreign key)
+ IDENTITY or GENERATED BY DEFAULT AS IDENTITY
The maximum column name length is 127 bytes; longer names are truncated to 127 bytes. The maximum number of columns you can define in a single table is 1,600.  
The following restrictions apply when adding a column to an external table:  
+ You can't add a column to an external table with the column constraints DEFAULT, ENCODE, NOT NULL, or NULL. 
+ You can't add columns to an external table that's defined using the AVRO file format. 
+ If pseudocolumns are enabled, the maximum number of columns that you can define in a single external table is 1,598. If pseudocolumns aren't enabled, the maximum number of columns that you can define in a single table is 1,600. 
For more information, see [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md).

 *column_type*   
The data type of the column being added. For CHAR and VARCHAR columns, you can use the MAX keyword instead of declaring a maximum length. MAX sets the maximum length to 4,096 bytes for CHAR or 65,535 bytes for VARCHAR. The maximum size of a GEOMETRY object is 1,048,447 bytes.   
For information about the data types that Amazon Redshift supports, see [Data types](c_Supported_data_types.md).

DEFAULT *default_expr*   <a name="alter-table-default"></a>
A clause that assigns a default data value for the column. The data type of *default_expr* must match the data type of the column. The DEFAULT value must be a variable-free expression. Subqueries, cross-references to other columns in the current table, and user-defined functions aren't allowed.  
The *default_expr* is used in any INSERT operation that doesn't specify a value for the column. If no default value is specified, the default value for the column is null.  
If a COPY operation encounters a null field on a column that has a DEFAULT value and a NOT NULL constraint, the COPY command inserts the value of the *default_expr*.   
DEFAULT isn't supported for external tables.
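
For example, the following sketch adds a column with a constant default; the names are hypothetical. Rows inserted without a value for `status` receive `'active'`:

```
ALTER TABLE users ADD COLUMN status VARCHAR(10) DEFAULT 'active';
```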

ENCODE *encoding*   
The compression encoding for a column. By default, Amazon Redshift automatically manages compression encoding for all columns in a table if you don't specify compression encoding for any column in the table or if you specify the ENCODE AUTO option for the table.  
If you specify compression encoding for any column in the table or if you don't specify the ENCODE AUTO option for the table, Amazon Redshift automatically assigns compression encoding to columns for which you don't specify compression encoding as follows:  
+ All columns in temporary tables are assigned RAW compression by default.
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, DOUBLE PRECISION, GEOMETRY, or GEOGRAPHY data type are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR, VARCHAR, or VARBYTE are assigned LZO compression.
If you don't want a column to be compressed, explicitly specify RAW encoding.
The following [compression encodings](c_Compression_encodings.md#compression-encoding-list) are supported:  
+ AZ64
+ BYTEDICT
+ DELTA
+ DELTA32K
+ LZO
+ MOSTLY8
+ MOSTLY16
+ MOSTLY32
+ RAW (no compression)
+ RUNLENGTH
+ TEXT255
+ TEXT32K
+ ZSTD
ENCODE isn't supported for external tables.

NOT NULL | NULL   
NOT NULL specifies that the column isn't allowed to contain null values. NULL, the default, specifies that the column accepts null values.  
NOT NULL and NULL aren't supported for external tables.

COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI }  
A clause that specifies whether string search or comparison on the column is case sensitive or case insensitive. The default value is the same as the current case sensitivity configuration of the database.  
To find the database collation information, use the following command:  

```
SELECT db_collation();
                     
db_collation
----------------
 case_sensitive
(1 row)
```
CASE_SENSITIVE and CS are interchangeable and yield the same results. Similarly, CASE_INSENSITIVE and CI are interchangeable and yield the same results.

DROP [ COLUMN ] *column_name*   
The name of the column to delete from the table.  
You can't drop the last column in a table. A table must have at least one column.  
You can't drop a column that is the distribution key (DISTKEY) or a sort key (SORTKEY) of the table. The default behavior for DROP COLUMN is RESTRICT if the column has any dependent objects, such as a view, primary key, foreign key, or UNIQUE restriction.  
The following restrictions apply when dropping a column from an external table:  
+ You can't drop a column from an external table if the column is used as a partition.
+ You can't drop a column from an external table that is defined using the AVRO file format. 
+ RESTRICT and CASCADE are ignored for external tables.
+ You can't drop the columns of the policy table referenced inside the policy definition unless you drop or detach the policy. This also applies when the CASCADE option is specified. You can drop other columns in the policy table.
For more information, see [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md).
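
For example, the following sketch drops a column and any objects that depend on it, such as views; the names are hypothetical:

```
ALTER TABLE event DROP COLUMN eventname CASCADE;
```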

RESTRICT   
When used with DROP COLUMN, RESTRICT means that the column isn't dropped in these cases:  
+ If a defined view references the column that is being dropped
+ If a foreign key references the column
+ If the column takes part in a multipart key
RESTRICT can't be used with CASCADE.  
RESTRICT and CASCADE are ignored for external tables.

CASCADE   
When used with DROP COLUMN, removes the specified column and anything dependent on that column. CASCADE can't be used with RESTRICT.  
RESTRICT and CASCADE are ignored for external tables.

The following options apply only to external tables.

SET LOCATION { 's3://*bucket/folder*/' | 's3://*bucket/manifest_file*' }  
The path to the Amazon S3 folder that contains the data files or a manifest file that contains a list of Amazon S3 object paths. The buckets must be in the same AWS Region as the Amazon Redshift cluster. For a list of supported AWS Regions, see [Amazon Redshift Spectrum limitations](c-spectrum-considerations.md). For more information about using a manifest file, see LOCATION in the CREATE EXTERNAL TABLE [Parameters](r_CREATE_EXTERNAL_TABLE.md#r_CREATE_EXTERNAL_TABLE-parameters) reference.

SET FILE FORMAT *format*  
The file format for external data files.  
Valid formats are as follows:  
+ AVRO 
+ PARQUET
+ RCFILE
+ SEQUENCEFILE
+ TEXTFILE 

SET TABLE PROPERTIES ( '*property_name*'='*property_value*')   
A clause that sets the table definition for table properties for an external table.   
Table properties are case-sensitive.  
'numRows'='*row_count*'   
A property that sets the numRows value for the table definition. To explicitly update an external table's statistics, set the numRows property to indicate the size of the table. Amazon Redshift doesn't analyze external tables to generate the table statistics that the query optimizer uses to generate a query plan. If table statistics aren't set for an external table, Amazon Redshift generates a query execution plan. This plan is based on an assumption that external tables are the larger tables and local tables are the smaller tables.  
'skip.header.line.count'='*line_count*'  
A property that sets the number of rows to skip at the beginning of each source file.
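
For example, the following sketch sets each property on an external table with hypothetical schema and table names:

```
ALTER TABLE spectrum.sales SET TABLE PROPERTIES ('numRows'='170000');
ALTER TABLE spectrum.sales SET TABLE PROPERTIES ('skip.header.line.count'='1');
```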

PARTITION ( *partition_column*=*partition_value* [, ...] ) SET LOCATION { 's3://*bucket*/*folder*' | 's3://*bucket*/*manifest_file*' }  
A clause that sets a new location for one or more partition columns. 

ADD [ IF NOT EXISTS ] PARTITION ( *partition_column*=*partition_value* [, ...] ) LOCATION { 's3://*bucket*/*folder*' | 's3://*bucket*/*manifest_file*' } [, ... ]  
A clause that adds one or more partitions. You can specify multiple PARTITION clauses using a single ALTER TABLE … ADD statement.  
If you use the AWS Glue catalog, you can add up to 100 partitions using a single ALTER TABLE statement.
The IF NOT EXISTS clause indicates that if the specified partition already exists, the command should make no changes. It also indicates that the command should return a message that the partition exists, rather than terminating with an error. This clause is useful when scripting, so the script doesn’t fail if ALTER TABLE tries to add a partition that already exists. 

DROP PARTITION (*partition\$1column*=*partition\$1value* [, ...] )   
A clause that drops the specified partition. Dropping a partition alters only the external table metadata. The data on Amazon S3 isn't affected.

ROW LEVEL SECURITY { ON | OFF } [ CONJUNCTION TYPE { AND | OR } ] [ FOR DATASHARES ]  
A clause that turns on or off row-level security for a relation.  
When row-level security is turned on for a relation, you can only read the rows that the row-level security policy permits you to access. When there isn't any policy granting you access to the relation, you can't see any rows from the relation. Only superusers and users or roles that have the `sys:secadmin` role can set the ROW LEVEL SECURITY clause. For more information, see [Row-level security](t_rls.md). This statement is supported on the connected database or on a database with Amazon Redshift federated permissions. The FOR DATASHARES clause isn't supported on a database with Amazon Redshift federated permissions.  
+ [ CONJUNCTION TYPE { AND | OR } ] 

  A clause that allows you to choose the conjunction type of row-level security policy for a relation. When multiple row-level security policies are attached to a relation, you can combine the policies with the AND or OR clause. By default, Amazon Redshift combines RLS policies with the AND clause. Superusers, users, or roles that have the `sys:secadmin` role can use this clause to define the conjunction type of row-level security policy for a relation. For more information, see [Combining multiple policies per user](t_rls_combine_policies.md). 
+ FOR DATASHARES

  A clause that determines whether an RLS-protected relation can be accessed over datashares. By default, an RLS-protected relation can’t be accessed over a datashare. An ALTER TABLE ROW LEVEL SECURITY command run with this clause only affects the relation’s datashare accessibility property. The ROW LEVEL SECURITY property isn’t changed. 

   If you make an RLS-protected relation accessible over datashares, the relation doesn’t have row-level security in the consumer-side datashared database. The relation retains its RLS property on the producer side. 

MASKING { ON | OFF } FOR DATASHARES  
A clause that determines whether a DDM-protected relation can be accessed over datashares. By default, a DDM-protected relation can’t be accessed over a datashare. If you make a DDM-protected relation accessible over datashares, the relation won’t have masking protection in the consumer-side datashared database. The relation retains its masking property on the producer side. Only superusers and users or roles that have the `sys:secadmin` role can set the MASKING FOR DATASHARES clause. For more information, see [Dynamic data masking](t_ddm.md). 
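
For example, the following sketch makes a DDM-protected relation accessible over datashares and then restores the default behavior, using the same sample table as the row-level security examples later in this section:

```
ALTER TABLE tickit_category_redshift MASKING OFF FOR DATASHARES;
ALTER TABLE tickit_category_redshift MASKING ON FOR DATASHARES;
```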

## Examples
<a name="r_ALTER_TABLE-examples"></a>

For examples that show how to use the ALTER TABLE command, see the following.
+ [ALTER TABLE examples](r_ALTER_TABLE_examples_basic.md)
+ [ALTER EXTERNAL TABLE examples](r_ALTER_TABLE_external-table.md)
+ [ALTER TABLE ADD and DROP COLUMN examples](r_ALTER_TABLE_COL_ex-add-drop.md)

# ALTER TABLE examples
<a name="r_ALTER_TABLE_examples_basic"></a>

The following examples demonstrate basic usage of the ALTER TABLE command. 

## Rename a table or view
<a name="r_ALTER_TABLE_examples_basic-rename-a-table"></a>

The following command renames the USERS table to USERS\_BKUP: 

```
alter table users
rename to users_bkup;
```

 You can also use this type of command to rename a view. 
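
For example, assuming a view named `event_view` exists (a hypothetical name), the same syntax renames it:

```
alter table event_view
rename to event_view_bkup;
```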

## Change the owner of a table or view
<a name="r_ALTER_TABLE_examples_basic-change-the-owner-of-a-table-or-view"></a>

The following command changes the VENUE table owner to the user DWUSER: 

```
alter table venue
owner to dwuser;
```

The following commands create a view, then change its owner: 

```
create view vdate as select * from date;
alter table vdate owner to vuser;
```

## Rename a column
<a name="r_ALTER_TABLE_examples_basic-rename-a-column"></a>

The following command renames the VENUESEATS column in the VENUE table to VENUESIZE: 

```
alter table venue
rename column venueseats to venuesize;
```

## Drop a table constraint
<a name="r_ALTER_TABLE_examples_drop-constraint"></a>

To drop a table constraint, such as a primary key, foreign key, or unique constraint, first find the internal name of the constraint. Then specify the constraint name in the ALTER TABLE command. The following example finds the constraints for the CATEGORY table, then drops the primary key with the name `category_pkey`. 

```
select constraint_name, constraint_type
from information_schema.table_constraints
where constraint_schema ='public'
and table_name = 'category';

constraint_name | constraint_type
----------------+----------------
category_pkey   | PRIMARY KEY

alter table category
drop constraint category_pkey;
```

## Alter a VARCHAR column
<a name="r_ALTER_TABLE_examples_alter-column"></a>

To conserve storage, you can define a table initially with VARCHAR columns with the minimum size needed for your current data requirements. Later, to accommodate longer strings, you can alter the table to increase the size of the column. 

The following example increases the size of the EVENTNAME column to VARCHAR(300). 

```
alter table event alter column eventname type varchar(300);
```

## Alter a VARBYTE column
<a name="r_ALTER_TABLE_examples_alter-varbyte-column"></a>

To conserve storage, you can define a table initially with VARBYTE columns with the minimum size needed for your current data requirements. Later, to accommodate longer strings, you can alter the table to increase the size of the column. 

The following example increases the size of the EVENTNAME column to VARBYTE(300). 

```
alter table event alter column eventname type varbyte(300);
```

## Alter the compression encoding for a column
<a name="r_ALTER_TABLE_examples_alter-column-encoding"></a>

You can alter the compression encoding of a column. Below, you can find a set of examples demonstrating this approach. The table definition for these examples is as follows.

```
create table t1(c0 int encode lzo, c1 bigint encode zstd, c2 varchar(16) encode lzo, c3 varchar(32) encode zstd);
```

The following statement alters the compression encoding for column c0 from LZO encoding to AZ64 encoding. 

```
alter table t1 alter column c0 encode az64;
```

The following statement alters the compression encoding for column c1 from Zstandard encoding to AZ64 encoding. 

```
alter table t1 alter column c1 encode az64;
```

The following statement alters the compression encoding for column c2 from LZO encoding to Byte-dictionary encoding. 

```
alter table t1 alter column c2 encode bytedict;
```

The following statement alters the compression encoding for column c3 from Zstandard encoding to Runlength encoding. 

```
alter table t1 alter column c3 encode runlength;
```

## Alter a DISTSTYLE KEY DISTKEY column
<a name="r_ALTER_TABLE_examples_alter-distkey"></a>

The following examples show how to change the DISTSTYLE and DISTKEY of a table.

Create a table with an EVEN distribution style. The SVV\_TABLE\_INFO view shows that the DISTSTYLE is EVEN. 

```
create table inventory(
  inv_date_sk int4 not null ,
  inv_item_sk int4 not null ,
  inv_warehouse_sk int4 not null ,
  inv_quantity_on_hand int4
) diststyle even;

Insert into inventory values(1,1,1,1);

select "table", "diststyle" from svv_table_info;

   table   |   diststyle
-----------+----------------
 inventory |     EVEN
```

Alter the table DISTKEY to `inv_warehouse_sk`. The SVV\_TABLE\_INFO view shows the `inv_warehouse_sk` column as the resulting distribution key. 

```
alter table inventory alter diststyle key distkey inv_warehouse_sk;

select "table", "diststyle" from svv_table_info;

   table   |       diststyle
-----------+-----------------------
 inventory | KEY(inv_warehouse_sk)
```

Alter the table DISTKEY to `inv_item_sk`. The SVV\_TABLE\_INFO view shows the `inv_item_sk` column as the resulting distribution key. 

```
alter table inventory alter distkey inv_item_sk;

select "table", "diststyle" from svv_table_info;

   table   |       diststyle
-----------+-----------------------
 inventory | KEY(inv_item_sk)
```

## Alter a table to DISTSTYLE ALL
<a name="r_ALTER_TABLE_examples_alter-diststyle-all"></a>

The following examples show how to change a table to DISTSTYLE ALL.

Create a table with an EVEN distribution style. The SVV\_TABLE\_INFO view shows that the DISTSTYLE is EVEN. 

```
create table inventory(
  inv_date_sk int4 not null ,
  inv_item_sk int4 not null ,
  inv_warehouse_sk int4 not null ,
  inv_quantity_on_hand int4
) diststyle even;

Insert into inventory values(1,1,1,1);

select "table", "diststyle" from svv_table_info;

   table   |   diststyle
-----------+----------------
 inventory |     EVEN
```

Alter the table DISTSTYLE to ALL. The SVV\_TABLE\_INFO view shows the changed DISTSTYLE. 

```
alter table inventory alter diststyle all;

select "table", "diststyle" from svv_table_info;

   table   |   diststyle
-----------+----------------
 inventory |     ALL
```

## Alter a table SORTKEY
<a name="r_ALTER_TABLE_examples_alter-sortkey"></a>

You can alter a table to have a compound sort key or no sort key.

In the following table definition, table `t1` is defined with an interleaved sortkey.

```
create table t1 (c0 int, c1 int) interleaved sortkey(c0, c1);
```

The following command alters the table from an interleaved sort key to a compound sort key.

```
alter table t1 alter sortkey(c0, c1);
```

The following command alters the table to remove the interleaved sort key.

```
alter table t1 alter sortkey none;
```

In the following table definition, table `t1` is defined with column `c0` as a sortkey.

```
create table t1 (c0 int, c1 int) sortkey(c0);
```

The following command alters the table `t1` to a compound sort key.

```
alter table t1 alter sortkey(c0, c1);
```

## Alter a table to ENCODE AUTO
<a name="r_ALTER_TABLE_examples_alter-encode-auto"></a>

The following example shows how to alter a table to ENCODE AUTO. 

The table definition for this example follows. Column `c0` is defined with the encoding type AZ64, and column `c1` is defined with the encoding type LZO.

```
create table t1(c0 int encode AZ64, c1 varchar encode LZO);
```

For this table, the following statement alters the encoding to AUTO.

```
alter table t1 alter encode auto;
```

The following example shows how to alter a table to remove the ENCODE AUTO setting. 

The table definition for this example follows. The table columns are defined without encoding. In this case, the encoding defaults to ENCODE AUTO.

```
create table t2(c0 int, c1 varchar);
```

For this table, the following statement alters the encoding of column c0 to LZO. The table encoding is no longer set to ENCODE AUTO.

```
alter table t2 alter column c0 encode lzo;
```

## Alter row-level security control
<a name="r_ALTER_TABLE_examples_basic-rls"></a>

The following command turns RLS off for the table: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY OFF;
```

The following command turns RLS on for the table: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON;
```

The following command turns RLS on for the table and makes it accessible over datashares: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON;
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY OFF FOR DATASHARES;
```

The following command turns RLS on for the table and makes it inaccessible over datashares: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON;
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON FOR DATASHARES;
```

The following command turns RLS on and sets RLS conjunction type to OR for the table: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON CONJUNCTION TYPE OR;
```

The following command turns RLS on and sets RLS conjunction type to AND for the table: 

```
ALTER TABLE tickit_category_redshift ROW LEVEL SECURITY ON CONJUNCTION TYPE AND;
```

# ALTER EXTERNAL TABLE examples
<a name="r_ALTER_TABLE_external-table"></a>

The following examples use an Amazon S3 bucket located in the US East (N. Virginia) AWS Region (`us-east-1`) and the example tables created in [Examples](r_CREATE_EXTERNAL_TABLE_examples.md) for CREATE EXTERNAL TABLE. For more information about how to use partitions with external tables, see [Partitioning Redshift Spectrum external tables](c-spectrum-external-tables.md#c-spectrum-external-tables-partitioning). 

The following example sets the numRows table property for the SPECTRUM.SALES external table to 170,000 rows.

```
alter table spectrum.sales
set table properties ('numRows'='170000');
```

The following example changes the location for the SPECTRUM.SALES external table.

```
alter table spectrum.sales
set location 's3://redshift-downloads/tickit/spectrum/sales/';
```

The following example changes the format for the SPECTRUM.SALES external table to Parquet.

```
alter table spectrum.sales
set file format parquet;
```

The following example adds one partition for the table SPECTRUM.SALES\_PART.

```
alter table spectrum.sales_part
add if not exists partition(saledate='2008-01-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01/';
```

The following example adds three partitions for the table SPECTRUM.SALES\_PART.

```
alter table spectrum.sales_part add if not exists
partition(saledate='2008-01-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01/'
partition(saledate='2008-02-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-02/'
partition(saledate='2008-03-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-03/';
```

The following example alters SPECTRUM.SALES\_PART to drop the partition with `saledate='2008-01-01'`.

```
alter table spectrum.sales_part
drop partition(saledate='2008-01-01');
```

The following example sets a new Amazon S3 path for the partition with `saledate='2008-01-01'`.

```
alter table spectrum.sales_part
partition(saledate='2008-01-01')
set location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01-01/';
```

The following example changes the name of `sales_date` to `transaction_date`. 

```
alter table spectrum.sales rename column sales_date to transaction_date;
```

The following example sets the column mapping to position mapping for an external table that uses optimized row columnar (ORC) format.

```
alter table spectrum.orc_example
set table properties('orc.schema.resolution'='position');
```

The following example sets the column mapping to name mapping for an external table that uses ORC format.

```
alter table spectrum.orc_example
set table properties('orc.schema.resolution'='name');
```

# ALTER TABLE ADD and DROP COLUMN examples
<a name="r_ALTER_TABLE_COL_ex-add-drop"></a>

The following examples demonstrate how to use ALTER TABLE to add and then drop a basic table column and also how to drop a column with a dependent object. 

## ADD then DROP a basic column
<a name="r_ALTER_TABLE_COL_ex-add-then-drop-a-basic-column"></a>

The following example adds a standalone FEEDBACK\_SCORE column to the USERS table. This column simply contains an integer, and the default value for this column is NULL (no feedback score). 

First, query the PG\_TABLE\_DEF catalog table to view the schema of the USERS table: 

```
column        | type                   | encoding | distkey | sortkey
--------------+------------------------+----------+---------+--------
userid        | integer                | delta    | true    |       1
username      | character(8)           | lzo      | false   |       0
firstname     | character varying(30)  | text32k  | false   |       0
lastname      | character varying(30)  | text32k  | false   |       0
city          | character varying(30)  | text32k  | false   |       0
state         | character(2)           | bytedict | false   |       0
email         | character varying(100) | lzo      | false   |       0
phone         | character(14)          | lzo      | false   |       0
likesports    | boolean                | none     | false   |       0
liketheatre   | boolean                | none     | false   |       0
likeconcerts  | boolean                | none     | false   |       0
likejazz      | boolean                | none     | false   |       0
likeclassical | boolean                | none     | false   |       0
likeopera     | boolean                | none     | false   |       0
likerock      | boolean                | none     | false   |       0
likevegas     | boolean                | none     | false   |       0
likebroadway  | boolean                | none     | false   |       0
likemusicals  | boolean                | none     | false   |       0
```

Now add the feedback\_score column: 

```
alter table users
add column feedback_score int
default NULL;
```

Select the FEEDBACK\_SCORE column from USERS to verify that it was added: 

```
select feedback_score from users limit 5;

feedback_score
----------------
NULL
NULL
NULL
NULL
NULL
```

Drop the column to reinstate the original DDL: 

```
alter table users drop column feedback_score;
```

## Dropping a column with a dependent object
<a name="r_ALTER_TABLE_COL_ex-dropping-a-column-with-a-dependent-object"></a>

The following example drops a column that has a dependent object. As a result, the dependent object is also dropped. 

To start, add the FEEDBACK\_SCORE column to the USERS table again: 

```
alter table users
add column feedback_score int
default NULL;
```

Next, create a view from the USERS table called USERS\_VIEW: 

```
create view users_view as select * from users;
```

Now, try to drop the FEEDBACK\_SCORE column from the USERS table. This DROP statement uses the default behavior (RESTRICT): 

```
alter table users drop column feedback_score;
```

Amazon Redshift displays an error message that the column can't be dropped because another object depends on it. 

Try dropping the FEEDBACK\_SCORE column again, this time specifying CASCADE to drop all dependent objects: 

```
alter table users
drop column feedback_score cascade;
```

# ALTER TABLE APPEND
<a name="r_ALTER_TABLE_APPEND"></a>

Appends rows to a target table by moving data from an existing source table. Data in the source table is moved to matching columns in the target table. Column order doesn't matter. After data is successfully appended to the target table, the source table is empty. ALTER TABLE APPEND is usually much faster than a similar [CREATE TABLE AS](r_CREATE_TABLE_AS.md) or [INSERT](r_INSERT_30.md) INTO operation because data is moved, not duplicated. 

**Note**  
ALTER TABLE APPEND moves data blocks between the source table and the target table. To improve performance, ALTER TABLE APPEND doesn't compact storage as part of the append operation. As a result, storage usage increases temporarily. To reclaim the space, run a [VACUUM](r_VACUUM_command.md) operation.

Columns with the same names must also have identical column attributes. If either the source table or the target table contains columns that don't exist in the other table, use the IGNOREEXTRA or FILLTARGET parameters to specify how extra columns should be managed. 

You can't append an identity column. If both tables include an identity column, the command fails. If only one table has an identity column, include the FILLTARGET or IGNOREEXTRA parameter. For more information, see [ALTER TABLE APPEND usage notes](#r_ALTER_TABLE_APPEND_usage).

You can append a GENERATED BY DEFAULT AS IDENTITY column. You can update columns defined as GENERATED BY DEFAULT AS IDENTITY with values that you supply. For more information, see [ALTER TABLE APPEND usage notes](#r_ALTER_TABLE_APPEND_usage). 

The target table must be a permanent table. However, the source can be a permanent table or a materialized view configured for streaming ingestion. Both objects must use the same distribution style and distribution key, if one is defined. If the objects are sorted, both objects must use the same sort style and define the same columns as sort keys.

An ALTER TABLE APPEND command automatically commits immediately upon completion of the operation. It can't be rolled back. You can't run ALTER TABLE APPEND within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 
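
For example, the following sketch (using the SALES and SALES\_MONTHLY tables from the examples later in this section) fails because ALTER TABLE APPEND can't run inside an explicit transaction block:

```
begin;
alter table sales append from sales_monthly;  -- returns an error inside a transaction block
end;
```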

## Required privileges
<a name="r_ALTER_TABLE_APPEND-privileges"></a>

Depending on the ALTER TABLE APPEND command, one of the following privileges is required:
+ Superuser
+ Users with the ALTER TABLE system privilege
+ Users with DELETE and SELECT privileges on the source table, and INSERT privilege on the target table

## Syntax
<a name="r_ALTER_TABLE_APPEND-synopsis"></a>

```
ALTER TABLE target_table_name APPEND FROM [ source_table_name | source_materialized_view_name ]
[ IGNOREEXTRA | FILLTARGET ]
```

Appending from a materialized view works only when your materialized view is configured for [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

## Parameters
<a name="r_ALTER_TABLE_APPEND-parameters"></a>

 *target\_table\_name*   
The name of the table to which rows are appended. Either specify just the name of the table or use the format *schema\_name.table\_name* to use a specific schema. The target table must be an existing permanent table.

 FROM *source\_table\_name*   
The name of the table that provides the rows to be appended. Either specify just the name of the table or use the format *schema\_name.table\_name* to use a specific schema. The source table must be an existing permanent table.

 FROM *source\_materialized\_view\_name*   
The name of a materialized view that provides the rows to be appended. Appending from a materialized view works only when your materialized view is configured for [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md). The source materialized view must already exist. 

IGNOREEXTRA   
A keyword that specifies that if the source table includes columns that are not present in the target table, data in the extra columns should be discarded. You can't use IGNOREEXTRA with FILLTARGET. 

FILLTARGET   
A keyword that specifies that if the target table includes columns that are not present in the source table, the columns should be filled with the [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) column value, if one was defined, or NULL. You can't use IGNOREEXTRA with FILLTARGET. 

## ALTER TABLE APPEND usage notes
<a name="r_ALTER_TABLE_APPEND_usage"></a>
+ ALTER TABLE APPEND moves only identical columns from the source table to the target table. Column order doesn't matter. 
+  If either the source table or the target table contains extra columns, use either FILLTARGET or IGNOREEXTRA according to the following rules: 
  + If the source table contains columns that don't exist in the target table, include IGNOREEXTRA. The command ignores the extra columns in the source table.
  + If the target table contains columns that don't exist in the source table, include FILLTARGET. The command fills the extra columns in the target table with either the default column value or IDENTITY value, if one was defined, or NULL.
  + If both the source table and the target table contain extra columns, the command fails. You can't use both FILLTARGET and IGNOREEXTRA. 
+ If a column with the same name but different attributes exists in both tables, the command fails. Like-named columns must have the following attributes in common: 
  + Data type
  + Column size
  + Compression encoding
  + Not null
  + Sort style
  + Sort key columns
  + Distribution style
  + Distribution key columns
+ You can't append an identity column. If both the source table and the target table have identity columns, the command fails. If only the source table has an identity column, include the IGNOREEXTRA parameter so that the identity column is ignored. If only the target table has an identity column, include the FILLTARGET parameter so that the identity column is populated according to the IDENTITY clause defined for the table. For more information, see [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default). 
+ You can append a default identity column with the ALTER TABLE APPEND statement. For more information, see [CREATE TABLE](r_CREATE_TABLE_NEW.md). 
+ ALTER TABLE APPEND operations hold exclusive locks when run on Amazon Redshift streaming materialized views connected to any of the following:
  +  An Amazon Kinesis data stream 
  +  An Amazon Managed Streaming for Apache Kafka topic 
  +  A supported external stream, such as a Confluent Cloud Kafka topic 

  For more information, see [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

## ALTER TABLE APPEND examples
<a name="r_ALTER_TABLE_APPEND_examples"></a>

Suppose your organization maintains a table, SALES\_MONTHLY, to capture current sales transactions. You want to move data from the transaction table to the SALES table every month. 

You can use the following INSERT INTO and TRUNCATE commands to accomplish the task. 

```
insert into sales (select * from sales_monthly);
truncate sales_monthly;
```

However, you can perform the same operation much more efficiently by using an ALTER TABLE APPEND command. 

First, query the [PG\_TABLE\_DEF](r_PG_TABLE_DEF.md) system catalog table to verify that both tables have the same columns with identical column attributes. 

```
select trim(tablename) as table, "column", trim(type) as type,
encoding, distkey, sortkey, "notnull"
from pg_table_def where tablename like 'sales%';

table      | column     | type                        | encoding | distkey | sortkey | notnull
-----------+------------+-----------------------------+----------+---------+---------+--------
sales      | salesid    | integer                     | lzo      | false   |       0 | true
sales      | listid     | integer                     | none     | true    |       1 | true
sales      | sellerid   | integer                     | none     | false   |       2 | true
sales      | buyerid    | integer                     | lzo      | false   |       0 | true
sales      | eventid    | integer                     | mostly16 | false   |       0 | true
sales      | dateid     | smallint                    | lzo      | false   |       0 | true
sales      | qtysold    | smallint                    | mostly8  | false   |       0 | true
sales      | pricepaid  | numeric(8,2)                | delta32k | false   |       0 | false
sales      | commission | numeric(8,2)                | delta32k | false   |       0 | false
sales      | saletime   | timestamp without time zone | lzo      | false   |       0 | false
salesmonth | salesid    | integer                     | lzo      | false   |       0 | true
salesmonth | listid     | integer                     | none     | true    |       1 | true
salesmonth | sellerid   | integer                     | none     | false   |       2 | true
salesmonth | buyerid    | integer                     | lzo      | false   |       0 | true
salesmonth | eventid    | integer                     | mostly16 | false   |       0 | true
salesmonth | dateid     | smallint                    | lzo      | false   |       0 | true
salesmonth | qtysold    | smallint                    | mostly8  | false   |       0 | true
salesmonth | pricepaid  | numeric(8,2)                | delta32k | false   |       0 | false
salesmonth | commission | numeric(8,2)                | delta32k | false   |       0 | false
salesmonth | saletime   | timestamp without time zone | lzo      | false   |       0 | false
```

Next, look at the size of each table.

```
select count(*) from sales_monthly;
 count
-------
  2000
(1 row)

select count(*) from sales;
 count
-------
 412214
(1 row)
```

Now run the following ALTER TABLE APPEND command.

```
alter table sales append from sales_monthly;         
```

Look at the size of each table again. The SALES\_MONTHLY table now has 0 rows, and the SALES table has grown by 2000 rows.

```
select count(*) from sales_monthly;
 count
-------
     0
(1 row)

select count(*) from sales;
 count
-------
 414214
(1 row)
```

If the source table has more columns than the target table, specify the IGNOREEXTRA parameter. The following example uses the IGNOREEXTRA parameter to ignore extra columns in the SALES\_LISTING table when appending to the SALES table.

```
alter table sales append from sales_listing ignoreextra;
```

If the target table has more columns than the source table, specify the FILLTARGET parameter. The following example uses the FILLTARGET parameter to populate columns in the SALES\_REPORT table that don't exist in the SALES\_MONTH table.

```
alter table sales_report append from sales_month filltarget;
```

The following example shows how to use ALTER TABLE APPEND with a materialized view as a source.

```
ALTER TABLE target_tbl APPEND FROM my_streaming_materialized_view;
```

The table and materialized view names in this example are samples. Appending from a materialized view works only when your materialized view is configured for [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md). It moves all records in the source materialized view to a target table with the same schema as the materialized view and leaves the materialized view intact. This is the same behavior as when the source of the data is a table.

# ALTER TEMPLATE
<a name="r_ALTER_TEMPLATE"></a>

Changes the definition of an existing template. Use this command to rename a template, change the owner of a template, add or remove parameters from the template definition, or set parameter values.

## Required privileges
<a name="r_ALTER_TEMPLATE-privileges"></a>

To alter a template, you must have one of the following:
+ Superuser privileges
+ ALTER TEMPLATE privilege and USAGE privilege on the schema containing the template

## Syntax
<a name="r_ALTER_TEMPLATE-synopsis"></a>

```
ALTER TEMPLATE [database_name.][schema_name.]template_name
{
RENAME TO new_name
| OWNER TO new_owner
| ADD parameter [AS] [value]
| DROP parameter
| SET parameter TO value1 [, parameter2 TO value2 , ...]
};
```

## Parameters
<a name="r_ALTER_TEMPLATE-parameters"></a>

 *database\_name*   
(Optional) The name of the database in which the template is created. If not specified, the current database is used. 

 *schema\_name*   
(Optional) The name of the schema in which the template is created. If not specified, the template is searched for in the current search path. 

 *template\_name*   
The name of the template to be altered. 

RENAME TO   
A clause that renames the template. 

 *new\_name*   
The new name of the template. For more information about valid names, see [Names and identifiers](r_names.md). 

OWNER TO   
A clause that changes the owner of the template. 

 *new\_owner*   
The new owner of the template. 

ADD *parameter* [AS] [*value*]  
Adds a new parameter to the template.  
+ For keyword-only parameters (such as CSV or GZIP), specify just the parameter name.
+ For parameters that require values, specify the parameter name followed by the value. You can optionally include AS between the parameter and value. 
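
For example, assuming `DELIMITER` is a parameter that the template accepts (a hypothetical choice), the following two statements are equivalent ways to add it with a value:

```
ALTER TEMPLATE demo_template
ADD DELIMITER AS '|';

ALTER TEMPLATE demo_template
ADD DELIMITER '|';
```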

DROP *parameter*  
Removes the specified parameter from the template. Cannot drop multiple parameters with a single DROP command.

SET *parameter* TO *value1* [, *parameter2* TO *value2* , ...]  
Updates the values of existing template parameters. Only use for parameters that already have values. Multiple parameters can be updated in a single command.

## Examples
<a name="r_ALTER_TEMPLATE-examples"></a>

The following example renames the test\_template template to demo\_template.

```
ALTER TEMPLATE test_template
RENAME TO demo_template;
```

The following example gives ownership of the demo\_template template to the user bob.

```
ALTER TEMPLATE demo_template
OWNER TO bob;
```

The following example adds the parameter `CSV` to the template demo\_template.

```
ALTER TEMPLATE demo_template
ADD CSV;
```

The following example adds the parameter `TIMEFORMAT 'auto'` to the template demo\_template.

```
ALTER TEMPLATE demo_template
ADD TIMEFORMAT 'auto';
```

The following example drops the parameter `ENCRYPTED` from the template demo\_template.

```
ALTER TEMPLATE demo_template
DROP ENCRYPTED;
```

The following example sets the `DELIMITER` parameter to `'|'` and the `TIMEFORMAT` parameter to `'epochsecs'`:

```
ALTER TEMPLATE demo_template
SET DELIMITER TO '|', TIMEFORMAT TO 'epochsecs';
```

# ALTER USER
<a name="r_ALTER_USER"></a>

Changes a database user.

## Required privileges
<a name="r_ALTER_USER-privileges"></a>

Following are required privileges for ALTER USER:
+ Superuser
+ Users with the ALTER USER privilege
+ Current user who wants to change their own password

## Syntax
<a name="r_ALTER_USER-synopsis"></a>

```
ALTER USER username [ WITH ] option [, ... ]

where option is

CREATEDB | NOCREATEDB
| CREATEUSER | NOCREATEUSER
| SYSLOG ACCESS { RESTRICTED | UNRESTRICTED }
| PASSWORD { 'password' | 'md5hash' | 'sha256hash' | DISABLE }
[ VALID UNTIL 'expiration_date' ]
| RENAME TO new_name
| CONNECTION LIMIT { limit | UNLIMITED }
| SESSION TIMEOUT limit | RESET SESSION TIMEOUT
| SET parameter { TO | = } { value | DEFAULT }
| RESET parameter
| EXTERNALID external_id
```

## Parameters
<a name="r_ALTER_USER-parameters"></a>

 *username*   
Name of the user. 

WITH   
Optional keyword. 

CREATEDB | NOCREATEDB   
The CREATEDB option allows the user to create new databases. NOCREATEDB is the default. 

CREATEUSER | NOCREATEUSER   
The CREATEUSER option creates a superuser with all database privileges, including CREATE USER. The default is NOCREATEUSER. For more information, see [Superusers](r_superusers.md).

SYSLOG ACCESS { RESTRICTED | UNRESTRICTED }  <a name="alter-user-syslog-access"></a>
A clause that specifies the level of access that the user has to the Amazon Redshift system tables and views.   
Regular users who have the SYSLOG ACCESS RESTRICTED permission can see only the rows generated by that user in user-visible system tables and views. The default is RESTRICTED.   
Regular users who have the SYSLOG ACCESS UNRESTRICTED permission can see all rows in user-visible system tables and views, including rows generated by another user. UNRESTRICTED doesn't give a regular user access to superuser-visible tables. Only superusers can see superuser-visible tables.   
Giving a user unrestricted access to system tables gives the user visibility to data generated by other users. For example, STL\_QUERY and STL\_QUERYTEXT contain the full text of INSERT, UPDATE, and DELETE statements, which might contain sensitive user-generated data. 
All rows in SVV\_TRANSACTIONS are visible to all users.   
For more information, see [Visibility of data in system tables and views](cm_chap_system-tables.md#c_visibility-of-data).
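
For example, a superuser might grant a monitoring user broader visibility into user-visible system views; the following statement is a sketch, and the user name `dbmonitor` is illustrative:

```
ALTER USER dbmonitor SYSLOG ACCESS UNRESTRICTED;
```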

PASSWORD { '*password*' | '*md5hash*' | '*sha256hash*' | DISABLE }  
Sets the user's password.   
By default, users can change their own passwords, unless the password is disabled. To disable a user's password, specify DISABLE. When a user's password is disabled, the password is deleted from the system and the user can log on only using temporary AWS Identity and Access Management (IAM) user credentials. For more information, see [Using IAM authentication to generate database user credentials](https://docs.aws.amazon.com/redshift/latest/mgmt/generating-user-credentials.html). Only a superuser can enable or disable passwords. You can't disable a superuser's password. To enable a password, run ALTER USER and specify a password.  
For details about using the PASSWORD parameter, see [CREATE USER](r_CREATE_USER.md). 

VALID UNTIL '*expiration\_date*'   
Specifies that the password has an expiration date. Use the value `'infinity'` to avoid having an expiration date. The valid data type for this parameter is timestamp.   
Only superusers can use this parameter.

RENAME TO   
Renames the user. 

 *new\_name*   
New name of the user. For more information about valid names, see [Names and identifiers](r_names.md).  
When you rename a user, you must also reset the user's password. The reset password doesn't have to be different from the previous password. The user name is used as part of the password encryption, so when a user is renamed, the password is cleared. The user will not be able to log on until the password is reset. For example:   

```
alter user newuser password 'EXAMPLENewPassword11'; 
```

CONNECTION LIMIT { *limit* | UNLIMITED }   
The maximum number of database connections the user is permitted to have open concurrently. The limit isn't enforced for superusers. Use the UNLIMITED keyword to permit the maximum number of concurrent connections. A limit on the number of connections for each database might also apply. For more information, see [CREATE DATABASE](r_CREATE_DATABASE.md). The default is UNLIMITED. To view current connections, query the [STV\_SESSIONS](r_STV_SESSIONS.md) system view.  
If both user and database connection limits apply, an unused connection slot must be available that is within both limits when a user attempts to connect.

SESSION TIMEOUT *limit* | RESET SESSION TIMEOUT  
The maximum time in seconds that a session remains inactive or idle. The range is 60 seconds (one minute) to 1,728,000 seconds (20 days). If no session timeout is set for the user, the cluster setting applies. For more information, see [ Quotas and limits in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.  
When you set the session timeout, it's applied to new sessions only.  
To view information about active user sessions, including the start time, user name, and session timeout, query the [STV\_SESSIONS](r_STV_SESSIONS.md) system view. To view information about user-session history, query the [STL\_SESSIONS](r_STL_SESSIONS.md) view. To retrieve information about database users, including session-timeout values, query the [SVL\_USER\_INFO](r_SVL_USER_INFO.md) view.

SET   
Sets a configuration parameter to a new default value for all sessions run by the specified user. 

RESET   
Resets a configuration parameter to the original default value for the specified user. 

 *parameter*   
Name of the parameter to set or reset. 

 *value*   
New value of the parameter. 

DEFAULT   
Sets the configuration parameter to the default value for all sessions run by the specified user. 
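
As a sketch of SET and RESET used together (the choice of `statement_timeout` as the configuration parameter is illustrative), the following statements change a session default for the user bob and then restore the original default:

```
ALTER USER bob SET statement_timeout TO 300000;
ALTER USER bob RESET statement_timeout;
```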

EXTERNALID *external\_id*   
The identifier for the user, which is associated with an identity provider. The user must have their password disabled. For more information, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html).

## Usage notes
<a name="r_ALTER_USER_usage_notes"></a>
+ **Attempting to alter rdsdb** – You can't alter the user named `rdsdb`.
+ **Creating an unknown password** – When using AWS Identity and Access Management (IAM) authentication to create database user credentials, you might want to create a superuser that is able to log in only using temporary credentials. You can't disable a superuser's password, but you can create an unknown password using a randomly generated MD5 hash string.

  ```
  alter user iam_superuser password 'md51234567890123456780123456789012';
  ```
+ **Setting search\_path** – When you set the [search\_path](r_search_path.md) parameter with the ALTER USER command, the modification takes effect on the specified user's next login. If you want to change the search\_path value for the current user and session, use a SET command. 
+ **Setting the time zone** – When you use SET TIMEZONE with the ALTER USER command, the modification takes effect on the specified user's next login.
+ **Working with dynamic data masking and row-level security policies** – When your provisioned cluster or serverless namespace has any dynamic data masking or row-level security policies, the following commands are blocked for regular users: 

  ```
  ALTER USER <current_user> SET enable_case_sensitive_super_attribute/enable_case_sensitive_identifier/downcase_delimited_identifier
  ```

  Only superusers and users with the ALTER USER privilege can set these configuration options. For information on row-level security, see [Row-level security](t_rls.md). For information on dynamic data masking, see [Dynamic data masking](t_ddm.md). 

## Examples
<a name="r_ALTER_USER-examples"></a>

The following example gives the user ADMIN the privilege to create databases: 

```
alter user admin createdb;
```

The following example sets the password of the user ADMIN to `adminPass9` and sets an expiration date and time for the password: 

```
alter user admin password 'adminPass9'
valid until '2017-12-31 23:59';
```

The following example renames the user ADMIN to SYSADMIN: 

```
alter user admin rename to sysadmin;
```

The following example updates the idle-session timeout for a user to 300 seconds.

```
ALTER USER dbuser SESSION TIMEOUT 300;
```

The following example resets the user's idle-session timeout. When you reset it, the cluster setting applies. You must be a database superuser to run this command. For more information, see [Quotas and limits in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.

```
ALTER USER dbuser RESET SESSION TIMEOUT;
```

The following example updates the external ID for a user named `bob`. The namespace is `myco_aad`. If the namespace isn't associated with a registered identity provider, it results in an error.

```
ALTER USER myco_aad:bob EXTERNALID "ABC123" PASSWORD DISABLE;
```

The following example sets the time zone for all sessions run by a specific database user. It changes the time zone for subsequent sessions, but not for the current session.

```
ALTER USER odie SET TIMEZONE TO 'Europe/Zurich';
```

The following example sets the maximum number of database connections that the user `bob` is allowed to have open.

```
ALTER USER bob CONNECTION LIMIT 10;
```

# ANALYZE
<a name="r_ANALYZE"></a>

Updates table statistics for use by the query planner. 

## Required privileges
<a name="r_ANALYZE-privileges"></a>

Following are required privileges for ANALYZE:
+ Superuser
+ Users with the ANALYZE privilege
+ Owner of the relation
+ Owner of the database that the table is shared to

## Syntax
<a name="r_ANALYZE-synopsis"></a>

```
ANALYZE [ VERBOSE ]
[ table_name [ ( column_name [, ...] ) ] ]
[ PREDICATE COLUMNS | ALL COLUMNS ]
```

## Parameters
<a name="r_ANALYZE-parameters"></a>

VERBOSE   
A clause that returns progress information messages about the ANALYZE operation. This option is useful when you don't specify a table.

 *table\_name*   
The name of a single table to analyze. You can analyze specific tables, including temporary tables, and you can qualify the table with its schema name. You can't specify more than one *table\_name* with a single ANALYZE *table\_name* statement. If you don't specify a *table\_name* value, all of the tables in the currently connected database are analyzed, including the persistent tables in the system catalog. Amazon Redshift skips analyzing a table if the percentage of rows that have changed since the last ANALYZE is lower than the analyze threshold. For more information, see [Analyze threshold](#r_ANALYZE-threshold).  
You don't need to analyze Amazon Redshift system tables (STL and STV tables).

 *column\_name*   
If you specify a *table\_name*, you can also specify one or more columns in the table (as a comma-separated list within parentheses). If a column list is specified, only the listed columns are analyzed.

 PREDICATE COLUMNS | ALL COLUMNS   
Clauses that indicate whether ANALYZE should include only predicate columns. Specify PREDICATE COLUMNS to analyze only columns that have been used as predicates in previous queries or are likely candidates to be used as predicates. Specify ALL COLUMNS to analyze all columns. The default is ALL COLUMNS.   
A column is included in the set of predicate columns if any of the following is true:  
+ The column has been used in a query as a part of a filter, join condition, or group by clause.
+ The column is a distribution key.
+ The column is part of a sort key.
If no columns are marked as predicate columns, for example because the table has not yet been queried, all of the columns are analyzed even when PREDICATE COLUMNS is specified. When this happens, Amazon Redshift might respond with a message like "No predicate columns found for *table-name*. Analyzing all columns." For more information about predicate columns, see [Analyzing tables](t_Analyzing_tables.md).

## Usage notes
<a name="r_ANALYZE-usage-notes"></a>

Amazon Redshift automatically runs ANALYZE on tables that you create with the following commands: 
+ CREATE TABLE AS
+ CREATE TEMP TABLE AS 
+ SELECT INTO

 You can't analyze an external table.

You don't need to run the ANALYZE command on these tables when they are first created. If you modify them, you should analyze them in the same way as other tables.

### Analyze threshold
<a name="r_ANALYZE-threshold"></a>

To reduce processing time and improve overall system performance, Amazon Redshift skips ANALYZE for a table if the percentage of rows that have changed since the last ANALYZE command run is lower than the analyze threshold specified by the [analyze\_threshold\_percent](r_analyze_threshold_percent.md) parameter. By default, `analyze_threshold_percent` is 10. To change `analyze_threshold_percent` for the current session, run the [SET](r_SET.md) command. The following example changes `analyze_threshold_percent` to 20 percent.

```
set analyze_threshold_percent to 20;
```

To analyze tables when only a small number of rows have changed, set `analyze_threshold_percent` to an arbitrarily small number. For example, if you set `analyze_threshold_percent` to 0.01, then a table with 100,000,000 rows isn't skipped if at least 10,000 rows have changed. 

```
set analyze_threshold_percent to 0.01;
```

If ANALYZE skips a table because it doesn't meet the analyze threshold, Amazon Redshift returns the following message.

```
ANALYZE SKIP
```

To analyze all tables even if no rows have changed, set `analyze_threshold_percent` to 0.

To view the results of ANALYZE operations, query the [STL\_ANALYZE](r_STL_ANALYZE.md) system table. 
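
For instance, a query along the following lines lists recent ANALYZE operations; the column names here are assumed from the STL\_ANALYZE reference, so verify them against that view's documentation:

```
select table_id, status, rows, starttime
from stl_analyze
order by starttime desc
limit 10;
```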

For more information about analyzing tables, see [Analyzing tables](t_Analyzing_tables.md).

## Examples
<a name="r_ANALYZE-examples"></a>

Analyze all of the tables in the TICKIT database and return progress information.

```
analyze verbose;
```

Analyze the LISTING table only.

```
analyze listing;
```

Analyze the VENUEID and VENUENAME columns in the VENUE table. 

```
analyze venue(venueid, venuename);
```

Analyze only predicate columns in the VENUE table.

```
analyze venue predicate columns;
```

# ANALYZE COMPRESSION
<a name="r_ANALYZE_COMPRESSION"></a>

Performs compression analysis and produces a report with the suggested compression encoding for the tables analyzed. For each column, the report includes an estimate of the potential reduction in disk space compared to the RAW encoding.

## Syntax
<a name="r_ANALYZE_COMPRESSION-synopsis"></a>

```
ANALYZE COMPRESSION
[ [ table_name ]
[ ( column_name [, ...] ) ] ]
[COMPROWS numrows]
```

## Parameters
<a name="r_ANALYZE_COMPRESSION-parameters"></a>

 *table\_name*   
You can analyze compression for specific tables, including temporary tables. You can qualify the table with its schema name. You can optionally specify a *table\_name* to analyze a single table. If you don't specify a *table\_name*, all of the tables in the currently connected database are analyzed. You can't specify more than one *table\_name* with a single ANALYZE COMPRESSION statement.

 *column\_name*   
If you specify a *table\_name*, you can also specify one or more columns in the table (as a comma-separated list within parentheses).

COMPROWS  
Number of rows to be used as the sample size for compression analysis. The analysis is run on rows from each data slice. For example, if you specify COMPROWS 1000000 (1,000,000) and the system contains 4 total slices, no more than 250,000 rows per slice are read and analyzed. If COMPROWS isn't specified, the sample size defaults to 100,000 per slice. Values of COMPROWS lower than the default of 100,000 rows per slice are automatically upgraded to the default value. However, compression analysis doesn't produce recommendations if the amount of data in the table is insufficient to produce a meaningful sample. If the COMPROWS number is greater than the number of rows in the table, the ANALYZE COMPRESSION command still proceeds and runs the compression analysis against all of the available rows. Using COMPROWS results in an error if a table isn't specified.

 *numrows*   
Number of rows to be used as the sample size for compression analysis. The accepted range for *numrows* is a number between 1000 and 1000000000 (1,000,000,000).
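
Putting *table\_name* and COMPROWS together, the following sketch analyzes compression for the LISTING table using a sample of up to 1,000,000 rows:

```
analyze compression listing comprows 1000000;
```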

## Usage notes
<a name="r_ANALYZE_COMPRESSION_usage_notes"></a>

ANALYZE COMPRESSION acquires an exclusive table lock, which prevents concurrent reads and writes against the table. Only run the ANALYZE COMPRESSION command when the table is idle.

Run ANALYZE COMPRESSION to get recommendations for column encoding schemes, based on a sample of the table's contents. ANALYZE COMPRESSION is an advisory tool and doesn't modify the column encodings of the table. You can apply the suggested encoding by recreating the table or by creating a new table with the same schema. Recreating an uncompressed table with appropriate encoding schemes can significantly reduce its on-disk footprint. This approach saves disk space and improves query performance for I/O-bound workloads.

ANALYZE COMPRESSION skips the actual analysis phase and directly returns the original encoding type on any column that is designated as a SORTKEY. It does this because range-restricted scans might perform poorly when SORTKEY columns are compressed much more highly than other columns.

## Examples
<a name="r_ANALYZE_COMPRESSION-examples"></a>

The following example shows the encoding and estimated percent reduction for the columns in the LISTING table only:

```
analyze compression listing;
  
  Table  |     Column     | Encoding | Est_reduction_pct 
---------+----------------+----------+-------------------
 listing | listid         | az64     | 40.96
 listing | sellerid       | az64     | 46.92
 listing | eventid        | az64     | 53.37
 listing | dateid         | raw      | 0.00
 listing | numtickets     | az64     | 65.66
 listing | priceperticket | az64     | 72.94
 listing | totalprice     | az64     | 68.05
 listing | listtime       | az64     | 49.74
```

The following example analyzes the QTYSOLD, COMMISSION, and SALETIME columns in the SALES table.

```
analyze compression sales(qtysold, commission, saletime);

 Table |   Column   | Encoding | Est_reduction_pct 
-------+------------+----------+-------------------
 sales | salesid    | N/A      | 0.00
 sales | listid     | N/A      | 0.00
 sales | sellerid   | N/A      | 0.00
 sales | buyerid    | N/A      | 0.00
 sales | eventid    | N/A      | 0.00
 sales | dateid     | N/A      | 0.00
 sales | qtysold    | az64     | 83.06
 sales | pricepaid  | N/A      | 0.00
 sales | commission | az64     | 71.85
 sales | saletime   | az64     | 49.63
```

# ATTACH MASKING POLICY
<a name="r_ATTACH_MASKING_POLICY"></a>

Attaches an existing dynamic data masking policy to a column. For more information on dynamic data masking, see [Dynamic data masking](t_ddm.md).

Superusers and users or roles that have the `sys:secadmin` role can attach a masking policy.

## Syntax
<a name="r_ATTACH_MASKING_POLICY-synopsis"></a>

```
ATTACH MASKING POLICY 
{
  policy_name ON relation_name
  | database_name.policy_name ON database_name.schema_name.relation_name
}
( { output_column_names | output_path } )
[ USING ( { input_column_names | input_path } ) ]
TO { user_name | ROLE role_name | PUBLIC }
[ PRIORITY priority ];
```

## Parameters
<a name="r_ATTACH_MASKING_POLICY-parameters"></a>

*policy\_name*   
The name of the masking policy to attach.

database\_name  
The name of the database where the policy and the relation are created. The policy and the relation need to be in the same database. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

schema\_name  
The name of the schema that the relation belongs to.

 *relation\_name*   
The name of the relation to attach the masking policy to.

*output\_column\_names*   
The names of the columns that the masking policy will apply to.

*output\_path*   
The full path of the SUPER object that the masking policy will apply to, including the column name. For example, for a relation with a SUPER type column named `person`, *output\_path* might be `person.name.first_name`. 

*input\_column\_names*   
The names of the columns that the masking policy will take as input. This parameter is optional. If not specified, the masking policy uses *output\_column\_names* as inputs.

*input\_path*   
The full path of the SUPER object that the masking policy will take as input. This parameter is optional. If not specified, the masking policy uses *output\_path* as input.

*user\_name*   
The name of the user to whom the masking policy will attach. You can't attach two policies to the same combination of user and column or role and column. You can attach a policy to a user and another policy to the user's role. In this case, the policy with the higher priority applies.  
You can only set one of user\_name, role\_name, and PUBLIC in a single ATTACH MASKING POLICY command. 

*role\_name*   
The name of the role to which the masking policy will attach. You can't attach two policies to the same column/role pair. You can attach a policy to a user and another policy to the user's role. In this case, the policy with the higher priority applies.  
You can only set one of user\_name, role\_name, and PUBLIC in a single ATTACH MASKING POLICY command. 

*PUBLIC*   
Attaches the masking policy to all users accessing the table. You must give other masking policies attached to specific column/user or column/role pairs a higher priority than the PUBLIC policy for them to apply.  
You can only set one of user\_name, role\_name, and PUBLIC in a single ATTACH MASKING POLICY command. 

*priority*   
The priority of the masking policy. When multiple masking policies apply to a given user's query, the highest priority policy applies.  
You can't attach two different policies to the same column with equal priority, even if the two policies are attached to different users or roles. You can attach the same policy multiple times to the same set of table, output column, input column, and priority parameters, as long as the user or role the policy attaches to is different each time.   
This field is optional. If you don't specify a priority, the masking policy attaches with a priority of 0.

For the usage of ATTACH MASKING POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).
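
As a minimal sketch of the syntax above (the policy, table, column, and role names are hypothetical, and the masking policy is assumed to already exist), the following statement attaches a policy to a single column for one role:

```
ATTACH MASKING POLICY mask_credit_card
ON credit_cards (credit_card_number)
TO ROLE analyst
PRIORITY 10;
```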

# ATTACH RLS POLICY
<a name="r_ATTACH_RLS_POLICY"></a>

Attaches a row-level security policy on a table to one or more users or roles.

Superusers and users or roles that have the `sys:secadmin` role can attach a policy.

## Syntax
<a name="r_ATTACH_RLS_POLICY-synopsis"></a>

```
ATTACH RLS POLICY 
{
  policy_name ON [TABLE] table_name [, ...]
  | database_name.policy_name ON [TABLE] database_name.schema_name.table_name [, ...]
}
TO { user_name | ROLE role_name | PUBLIC } [, ...]
```

## Parameters
<a name="r_ATTACH_RLS_POLICY-parameters"></a>

 *policy\_name*   
The name of the policy.

database\_name  
The name of the database where the policy and the relation are created. The policy and the relation need to be in the same database. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

schema\_name  
The name of the schema that the relation belongs to.

table\_name  
The relation that the row-level security policy is attached to.

TO { *user\_name* | ROLE *role\_name* | PUBLIC } [, ...]  
Specifies whether the policy is attached to one or more specified users or roles. 

For the usage of ATTACH RLS POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).

## Usage notes
<a name="r_ATTACH_RLS_POLICY-usage"></a>

When working with the ATTACH RLS POLICY statement, observe the following:
+ The table being attached should have all the columns listed in the WITH clause of the policy creation statement.
+ Amazon Redshift RLS supports attaching RLS policies to the following objects:
  +  Tables 
  +  Views
  +  Late-binding views 
  +  Materialized views
+ Amazon Redshift RLS doesn't support attaching RLS policies to the following objects:
  +  Catalog tables 
  +  Cross-database relations 
  +  External tables 
  +  Temporary tables 
  +  Policy lookup tables
  + Materialized view base tables
+ RLS policies that are attached to superusers or to users with the `sys:secadmin` permission are ignored.

## Examples
<a name="r_ATTACH_RLS_POLICY-examples"></a>

The following example attaches an RLS policy to the specified table and role combinations. The RLS policy applies to any users with the role of `analyst` or `dbadmin` when they access the tickit\_category\_redshift table.

```
ATTACH RLS POLICY policy_concerts ON tickit_category_redshift TO ROLE analyst, ROLE dbadmin;
```

# BEGIN
<a name="r_BEGIN"></a>

Starts a transaction. Synonymous with START TRANSACTION.

A transaction is a single, logical unit of work, whether it consists of one command or multiple commands. In general, all commands in a transaction run on a snapshot of the database whose starting time is determined by the value set for the `transaction_snapshot_begin` system configuration parameter.

By default, individual Amazon Redshift operations (queries, DDL statements, loads) are automatically committed to the database. If you want to suspend the commit for an operation until subsequent work is completed, you need to open a transaction with the BEGIN statement, then run the required commands, then close the transaction with a [COMMIT](r_COMMIT.md) or [END](r_END.md) statement. If necessary, you can use a [ROLLBACK](r_ROLLBACK.md) statement to stop a transaction that is in progress. An exception to this behavior is the [TRUNCATE](r_TRUNCATE.md) command, which commits the transaction in which it is run and can't be rolled back.
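
The pattern described above can be sketched as follows; the table name and values are illustrative:

```
begin;

create table fruit_stock(name varchar(20), quantity int);
insert into fruit_stock values ('apple', 10);

commit;
```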

## Syntax
<a name="r_BEGIN-synopsis"></a>

```
BEGIN [ WORK | TRANSACTION ] [ ISOLATION LEVEL option ] [ READ WRITE | READ ONLY ]

START TRANSACTION [ ISOLATION LEVEL option ] [ READ WRITE | READ ONLY ]

Where option is

SERIALIZABLE
| READ UNCOMMITTED
| READ COMMITTED
| REPEATABLE READ

Note: READ UNCOMMITTED, READ COMMITTED, and REPEATABLE READ have no
operational impact and map to SERIALIZABLE in Amazon Redshift. You can see database isolation levels on your cluster 
by querying the stv_db_isolation_level table.
```

## Parameters
<a name="r_BEGIN-parameters"></a>

WORK   
Optional keyword.

TRANSACTION   
Optional keyword; WORK and TRANSACTION are synonyms.

ISOLATION LEVEL SERIALIZABLE   
Serializable isolation is supported by default, so the behavior of the transaction is the same whether or not this syntax is included in the statement. For more information, see [Managing concurrent write operations](c_Concurrent_writes.md). No other isolation levels are supported.  
The SQL standard defines four levels of transaction isolation to prevent *dirty reads* (where a transaction reads data written by a concurrent uncommitted transaction), *nonrepeatable reads* (where a transaction re-reads data it read previously and finds that data was changed by another transaction that committed since the initial read), and *phantom reads* (where a transaction re-runs a query, returns a set of rows that satisfy a search condition, and then finds that the set of rows has changed because of another recently committed transaction):  
+ Read uncommitted: Dirty reads, nonrepeatable reads, and phantom reads are possible.
+ Read committed: Nonrepeatable reads and phantom reads are possible.
+ Repeatable read: Phantom reads are possible.
+ Serializable: Prevents dirty reads, nonrepeatable reads, and phantom reads.
Though you can use any of the four transaction isolation levels, Amazon Redshift processes all isolation levels as serializable.

READ WRITE   
Gives the transaction read and write permissions.

READ ONLY   
Gives the transaction read-only permissions.

## Examples
<a name="r_BEGIN-examples"></a>

The following example starts a serializable transaction block: 

```
begin;
```

The following example starts the transaction block with a serializable isolation level and read and write permissions: 

```
begin read write;
```

# CALL
<a name="r_CALL_procedure"></a>

Runs a stored procedure. The CALL command must include the procedure name and the input argument values. A stored procedure can be run only by using the CALL statement.

**Note**  
CALL can't be part of any regular queries.

## Syntax
<a name="r_CALL_procedure-synopsis"></a>

```
CALL sp_name ( [ argument ] [, ...] )
```

## Parameters
<a name="r_CALL_procedure-parameters"></a>

 *sp\_name*   
The name of the procedure to run. 

 *argument*   
The value of the input argument. This parameter can also be a function name, for example `pg_last_query_id()`. You can't use queries as CALL arguments. 

## Usage notes
<a name="r_CALL_procedure-usage-notes"></a>

Amazon Redshift stored procedures support nested and recursive calls, as described in the following sections. In addition, make sure that your driver support is up to date.

**Topics**
+ [Nested calls](#r_CALL_procedure-nested-calls)
+ [Driver support](#r_CALL_procedure-driver-support)

### Nested calls
<a name="r_CALL_procedure-nested-calls"></a>

Amazon Redshift stored procedures support nested and recursive calls. The maximum number of nesting levels allowed is 16. Nested calls can encapsulate business logic into smaller procedures, which can be shared by multiple callers. 

If you call a nested procedure that has output parameters, the inner procedure must define INOUT arguments. In this case, the inner procedure is passed in a nonconstant variable. OUT arguments aren't allowed. This behavior occurs because a variable is needed to hold the output of the inner call.

The relationship between inner and outer procedures is logged in the `from_sp_call` column of [SVL\_STORED\_PROC\_CALL](r_SVL_STORED_PROC_CALL.md). 

The following example shows passing variables to a nested procedure call through INOUT arguments.

```
CREATE OR REPLACE PROCEDURE inner_proc(INOUT a int, b int, INOUT c int) LANGUAGE plpgsql
AS $$
BEGIN
  a := b * a;
  c := b * c;
END;
$$;

CREATE OR REPLACE PROCEDURE outer_proc(multiplier int) LANGUAGE plpgsql
AS $$
DECLARE
  x int := 3;
  y int := 4;
BEGIN
  DROP TABLE IF EXISTS test_tbl;
  CREATE TEMP TABLE test_tbl(a int, b varchar(256));
  CALL inner_proc(x, multiplier, y);
  insert into test_tbl values (x, y::varchar);
END;
$$;

CALL outer_proc(5);

SELECT * from test_tbl;
 a  | b
----+----
 15 | 20
(1 row)
```

### Driver support
<a name="r_CALL_procedure-driver-support"></a>

We recommend that you upgrade your Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) drivers to the latest version that has support for Amazon Redshift stored procedures. 

You might be able to use your existing driver if your client tool uses driver API operations that pass through the CALL statement to the server. Output parameters, if any, are returned as a result set of one row. 

The latest versions of Amazon Redshift JDBC and ODBC drivers have metadata support for stored procedure discovery. They also have `CallableStatement` support for custom Java applications. For more information on drivers, see [Connecting to an Amazon Redshift Cluster Using SQL Client Tools](https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-to-cluster.html) in the *Amazon Redshift Management Guide.* 

The following examples show how to use different API operations of the JDBC driver for stored procedure calls.

```
void statement_example(Connection conn) throws SQLException {
  Statement statement = conn.createStatement();
  statement.execute("CALL sp_statement_example(1)");
}

void prepared_statement_example(Connection conn) throws SQLException {
  String sql = "CALL sp_prepared_statement_example(42, 84)";
  PreparedStatement pstmt = conn.prepareStatement(sql);
  pstmt.execute();
}

void callable_statement_example(Connection conn) throws SQLException {
  CallableStatement cstmt = conn.prepareCall("CALL sp_create_out_in(?,?)");
  cstmt.registerOutParameter(1, java.sql.Types.INTEGER);
  cstmt.setInt(2, 42);
  cstmt.executeQuery();
  Integer out_value = cstmt.getInt(1);
}
```

## Examples
<a name="r_CALL_procedure-examples"></a>

The following example calls the procedure named `test_sp1`.

```
call test_sp1(3,'book');
INFO:  Table "tmp_tbl" does not exist and will be skipped
INFO:  min_val = 3, f2 = book
```

The following example calls the procedure named `test_sp2`.

```
call test_sp2(2,'2019');

         f2          | column2
---------------------+---------
 2019+2019+2019+2019 | 2
(1 row)
```

# CANCEL
<a name="r_CANCEL"></a>

Cancels a database query that is currently running.

The CANCEL command requires the process ID or session ID of the running query and displays a confirmation message to verify that the query was canceled.

## Required privileges
<a name="r_CANCEL-privileges"></a>

Following are required privileges for CANCEL:
+ Superuser canceling their own query
+ Superuser canceling a user's query
+ Users with the CANCEL privilege canceling a user's query
+ User canceling their own query

## Syntax
<a name="r_CANCEL-synopsis"></a>

```
CANCEL process_id [ 'message' ]
```

## Parameters
<a name="r_CANCEL-parameters"></a>

 *process\_id*   
To cancel a query running in an Amazon Redshift cluster, use the `pid` (Process ID) from [STV\_RECENTS](r_STV_RECENTS.md) that corresponds to the query that you want to cancel.  
To cancel a query running in an Amazon Redshift Serverless workgroup, use the `session_id` from [SYS\_QUERY\_HISTORY](SYS_QUERY_HISTORY.md) that corresponds to the query that you want to cancel.

'*message*'   
An optional confirmation message that displays when the query cancellation completes. If you don't specify a message, Amazon Redshift displays the default message as verification. You must enclose the message in single quotation marks.

## Usage notes
<a name="r_CANCEL-usage-notes"></a>

You can't cancel a query by specifying a *query ID*; you must specify the query's *process ID* (PID) or *session ID*. You can cancel only queries that are currently being run by your user. Superusers can cancel all queries.

If queries in multiple sessions hold locks on the same table, you can use the [PG\_TERMINATE\_BACKEND](PG_TERMINATE_BACKEND.md) function to terminate one of the sessions. Doing this forces any currently running transactions in the terminated session to release all locks and roll back the transaction. To view currently held locks, query the [STV\_LOCKS](r_STV_LOCKS.md) system table. 

Following certain internal events, Amazon Redshift might restart an active session and assign a new PID. If the PID has changed, you might receive the following error message.

```
Session <PID> does not exist. The session PID might have changed. Check the stl_restarted_sessions system table for details.
```

To find the new PID, query the [STL\_RESTARTED\_SESSIONS](r_STL_RESTARTED_SESSIONS.md) system table and filter on the `oldpid` column.

```
select oldpid, newpid from stl_restarted_sessions where oldpid = 1234;
```

## Examples
<a name="r_CANCEL-examples"></a>

To cancel a currently running query in an Amazon Redshift cluster, first retrieve the process ID for the query that you want to cancel. To determine the process IDs for all currently running queries, type the following command: 

```
select pid, starttime, duration,
trim(user_name) as user,
trim (query) as querytxt
from stv_recents
where status = 'Running';

pid |         starttime          | duration |   user   |    querytxt
-----+----------------------------+----------+----------+-----------------
802 | 2008-10-14 09:19:03.550885 |      132 | dwuser | select
venuename from venue where venuestate='FL', where venuecity not in
('Miami' , 'Orlando');
834 | 2008-10-14 08:33:49.473585 |  1250414 | dwuser | select *
from listing;
964 | 2008-10-14 08:30:43.290527 |   326179 | dwuser | select
sellerid from sales where qtysold in (8, 10);
```

Check the query text to determine which process ID (PID) corresponds to the query that you want to cancel.

Type the following command to use PID 802 to cancel that query: 

```
cancel 802;
```

The session where the query was running displays the following message: 

```
ERROR:  Query (168) cancelled on user's request
```

where `168` is the query ID (not the process ID used to cancel the query).

Alternatively, you can specify a custom confirmation message to display instead of the default message. To specify a custom message, include your message in single quotation marks at the end of the CANCEL command: 

```
cancel 802 'Long-running query';
```

The session where the query was running displays the following message: 

```
ERROR:  Long-running query
```
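
For an Amazon Redshift Serverless workgroup, the workflow is the same except that you look up a session ID rather than a process ID. The following sketch assumes the `session_id`, `start_time`, `query_text`, and `status` columns of SYS\_QUERY\_HISTORY; substitute the session ID that the first query returns:

```
select session_id, start_time, trim(query_text) as querytxt
from sys_query_history
where status = 'running';

cancel <session-id>;
```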

# CLOSE
<a name="close"></a>

(Optional) Frees all of the resources that are associated with an open cursor. [COMMIT](r_COMMIT.md), [END](r_END.md), and [ROLLBACK](r_ROLLBACK.md) automatically close the cursor, so it isn't necessary to use the CLOSE command to explicitly close the cursor. 

For more information, see [DECLARE](declare.md) and [FETCH](fetch.md). 

## Syntax
<a name="close-synopsis"></a>

```
CLOSE cursor
```

## Parameters
<a name="close-parameters"></a>

*cursor*   
Name of the cursor to close. 

## CLOSE example
<a name="close-example"></a>

The following commands close the cursor and perform a commit, which ends the transaction:

```
close movie_cursor;
commit;
```
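
For context, a cursor is declared and fetched within the same transaction before it is closed. A minimal sketch, assuming a table named `movie` exists:

```
begin;
declare movie_cursor cursor for
select * from movie;
fetch forward 5 from movie_cursor;
close movie_cursor;
commit;
```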

# COMMENT
<a name="r_COMMENT"></a>

Creates or changes a comment about a database object.

## Syntax
<a name="r_COMMENT-synopsis"></a>

```
COMMENT ON
{
TABLE object_name |
COLUMN object_name.column_name |
CONSTRAINT constraint_name ON table_name |
DATABASE object_name |
VIEW object_name
}
IS 'text' | NULL
```

## Parameters
<a name="r_COMMENT-parameters"></a>

 *object\_name*   
Name of the database object being commented on. You can add a comment to the following objects:  
+ TABLE
+ COLUMN (also takes a *column\_name*).
+ CONSTRAINT (also takes a *constraint\_name* and *table\_name*).
+ DATABASE
+ VIEW
+ SCHEMA

IS '*text*' | NULL  
The comment text that you want to add or replace for the specified object. The *text* string is data type TEXT. Enclose the comment in single quotation marks. Set the value to NULL to remove the comment text.

 *column\_name*   
Name of the column being commented on. Parameter of COLUMN. Follows a table specified in `object_name`.

 *constraint\_name*   
Name of the constraint that is being commented on. Parameter of CONSTRAINT.

 *table\_name*   
Name of a table containing the constraint. Parameter of CONSTRAINT.

## Usage notes
<a name="r_COMMENT-usage-notes"></a>

You must be a superuser or the owner of a database object to add or update a comment.

Comments on databases may only be applied to the current database. A warning message is displayed if you attempt to comment on a different database. The same warning is displayed for comments on databases that don't exist.

Comments on external tables, external columns, and columns of late binding views are not supported.

## Examples
<a name="r_COMMENT-example"></a>

The following example adds a comment to the SALES table. 

```
COMMENT ON TABLE sales IS 'This table stores tickets sales data';
```

The following example displays the comment on the SALES table. 

```
select obj_description('public.sales'::regclass);

obj_description
-------------------------------------
This table stores tickets sales data
```

The following example removes a comment from the SALES table. 

```
COMMENT ON TABLE sales IS NULL;
```

The following example adds a comment to the EVENTID column of the SALES table. 

```
COMMENT ON COLUMN sales.eventid IS 'Foreign-key reference to the EVENT table.';
```

The following example displays a comment on the EVENTID column (column number 5) of the SALES table. 

```
select col_description( 'public.sales'::regclass, 5::integer );

col_description
-----------------------------------------
Foreign-key reference to the EVENT table.
```

The following example adds a descriptive comment to the EVENT table. 

```
comment on table event is 'Contains listings of individual events.';
```

To view comments, query the PG\_DESCRIPTION system catalog. The following example returns the description for the EVENT table.

```
select * from pg_catalog.pg_description
where objoid =
(select oid from pg_class where relname = 'event'
and relnamespace =
(select oid from pg_catalog.pg_namespace where nspname = 'public') );

objoid | classoid | objsubid | description
-------+----------+----------+----------------------------------------
116658 |     1259 |        0 | Contains listings of individual events.
```
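
You can also comment on a constraint. A sketch, assuming the SALES table has a constraint named `sales_pkey`:

```
COMMENT ON CONSTRAINT sales_pkey ON sales IS 'Primary key for the SALES table.';
```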

# COMMIT
<a name="r_COMMIT"></a>

Commits the current transaction to the database. This command makes the database updates from the transaction permanent.

## Syntax
<a name="r_COMMIT-synopsis"></a>

```
COMMIT [ WORK | TRANSACTION ]
```

## Parameters
<a name="r_COMMIT-parameters"></a>

WORK  
Optional keyword. This keyword isn't supported within a stored procedure. 

TRANSACTION  
Optional keyword. WORK and TRANSACTION are synonyms. Neither is supported within a stored procedure. 

For information about using COMMIT within a stored procedure, see [Managing transactions](stored-procedure-transaction-management.md). 

## Examples
<a name="r_COMMIT-examples"></a>

Each of the following examples commits the current transaction to the database:

```
commit;
```

```
commit work;
```

```
commit transaction;
```
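
COMMIT is typically paired with an explicit BEGIN. The following sketch, assuming a table named `demo_tbl` exists, makes an insert permanent:

```
begin;
insert into demo_tbl values (1);
commit;
```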

# COPY
<a name="r_COPY"></a>


|  | 
| --- |
|  Client-side encryption for COPY and UNLOAD commands will no longer be open to new customers starting April 30, 2025. If you used client-side encryption with COPY and UNLOAD commands in the 12 months before April 30, 2025, you can continue to use client-side encryption with COPY or UNLOAD commands until April 30, 2026. After April 30, 2026, you won't be able to use client-side encryption for COPY and UNLOAD. We recommend that you switch to using server-side encryption for COPY and UNLOAD as soon as possible. If you're already using server-side encryption for COPY and UNLOAD, there's no change and you can continue to use it without altering your queries. For more information on encryption for COPY and UNLOAD, see the ENCRYPTED parameter below.  | 

Loads data into a table from data files or from an Amazon DynamoDB table. The files can be located in an Amazon Simple Storage Service (Amazon S3) bucket, an Amazon EMR cluster, or a remote host that is accessed using a Secure Shell (SSH) connection.

**Note**  
Amazon Redshift Spectrum external tables are read-only. You can't COPY to an external table.

The COPY command appends the input data as additional rows to the table.

The maximum size of a single input row from any source is 4 MB.

**Topics**
+ [Required permissions](#r_COPY-permissions)
+ [COPY syntax](#r_COPY-syntax)
+ [Required parameters](#r_COPY-syntax-required-parameters)
+ [Optional parameters](#r_COPY-syntax-overview-optional-parameters)
+ [Usage notes and additional resources for the COPY command](#r_COPY-using-the-copy-command)
+ [COPY command examples](#r_COPY-using-the-copy-command-examples)
+ [COPY JOB](r_COPY-JOB.md)
+ [COPY with TEMPLATE](r_COPY-WITH-TEMPLATE.md)
+ [COPY parameter reference](r_COPY-parameters.md)
+ [Usage notes](r_COPY_usage_notes.md)
+ [COPY examples](r_COPY_command_examples.md)

## Required permissions
<a name="r_COPY-permissions"></a>

To use the COPY command, you must have [INSERT](r_GRANT.md#grant-insert) privilege for the Amazon Redshift table.

## COPY syntax
<a name="r_COPY-syntax"></a>

```
COPY table-name 
[ column-list ]
FROM data_source
authorization
[ [ FORMAT ] [ AS ] data_format ] 
[ parameter [ argument ] [, ... ] ]
```

You can perform a COPY operation with as few as three parameters: a table name, a data source, and authorization to access the data. 

Amazon Redshift extends the functionality of the COPY command to enable you to load data in several data formats from multiple data sources, control access to load data, manage data transformations, and manage the load operation. 

The following sections present the required COPY command parameters, grouping the optional parameters by function. They also describe each parameter and explain how various options work together. You can go directly to a parameter description by using the alphabetical parameter list. 

## Required parameters
<a name="r_COPY-syntax-required-parameters"></a>

The COPY command requires three elements: 
+ [Table Name](#r_COPY-syntax-overview-table-name)
+ [Data Source](#r_COPY-syntax-overview-data-source)
+ [Authorization](#r_COPY-syntax-overview-credentials)

The simplest COPY command uses the following format. 

```
COPY table-name 
FROM data-source
authorization;
```

The following example creates a table named CATDEMO, and then loads the table with sample data from a data file in Amazon S3 named `category_pipe.txt`. 

```
create table catdemo(catid smallint, catgroup varchar(10), catname varchar(10), catdesc varchar(50));
```

In the following example, the data source for the COPY command is a data file named `category_pipe.txt` in the `tickit` folder of an Amazon S3 bucket named `redshift-downloads`. The COPY command is authorized to access the Amazon S3 bucket through an AWS Identity and Access Management (IAM) role. If your cluster has an existing IAM role with permission to access Amazon S3 attached, you can substitute your role's Amazon Resource Name (ARN) in the following COPY command and run it.

```
copy catdemo
from 's3://redshift-downloads/tickit/category_pipe.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
region 'us-east-1';
```

For complete instructions on how to use COPY commands to load sample data, including instructions for loading data from other AWS regions, see [Load Sample Data from Amazon S3](https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-create-sample-db.html) in the Amazon Redshift Getting Started Guide.

*table-name*  <a name="r_COPY-syntax-overview-table-name"></a>
The name of the target table for the COPY command. The table must already exist in the database. The table can be temporary or persistent. The COPY command appends the new input data to any existing rows in the table.

FROM *data-source*  <a name="r_COPY-syntax-overview-data-source"></a>
The location of the source data to be loaded into the target table. A manifest file can be specified with some data sources.   
The most commonly used data repository is an Amazon S3 bucket. You can also load from data files located in an Amazon EMR cluster, an Amazon EC2 instance, or a remote host that your cluster can access using an SSH connection, or you can load directly from a DynamoDB table.   
+ [COPY from Amazon S3](copy-parameters-data-source-s3.md)
+ [COPY from Amazon EMR](copy-parameters-data-source-emr.md) 
+ [COPY from remote host (SSH)](copy-parameters-data-source-ssh.md)
+ [COPY from Amazon DynamoDB](copy-parameters-data-source-dynamodb.md)

Authorization  <a name="r_COPY-syntax-overview-credentials"></a>
A clause that indicates the method that your cluster uses for authentication and authorization to access other AWS resources. The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an IAM role that is attached to your cluster or by providing the access key ID and secret access key for an IAM user.   
+ [Authorization parameters](copy-parameters-authorization.md) 
+ [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based) 
+ [Key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based) 

## Optional parameters
<a name="r_COPY-syntax-overview-optional-parameters"></a>

You can optionally specify how COPY maps field data to columns in the target table, define source data attributes to enable the COPY command to correctly read and parse the source data, and manage which operations the COPY command performs during the load process. 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](#r_COPY-syntax-overview-data-format)
+ [Data conversion parameters](#r_COPY-syntax-overview-data-conversion)
+ [Data load operations](#r_COPY-syntax-overview-data-load)

### Column mapping
<a name="r_COPY-syntax-overview-column-mapping"></a>

By default, COPY inserts field values into the target table's columns in the same order as the fields occur in the data files. If the default column order will not work, you can specify a column list or use JSONPath expressions to map source data fields to the target columns. 
+ [Column List](copy-parameters-column-mapping.md#copy-column-list)
+ [JSONPaths File](copy-parameters-column-mapping.md#copy-column-mapping-jsonpaths)
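
For example, if a data file supplies only some of a table's columns, a column list maps each field to its target column. A sketch using the CATDEMO table from the COPY examples and a hypothetical file name:

```
copy catdemo (catid, catname)
from 's3://amzn-s3-demo-bucket/tickit/category_subset.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>';
```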

### Data format parameters
<a name="r_COPY-syntax-overview-data-format"></a>

You can load data from text files in fixed-width, character-delimited, comma-separated values (CSV), or JSON format, or from Avro files.

By default, the COPY command expects the source data to be in character-delimited UTF-8 text files. The default delimiter is a pipe character ( | ). If the source data is in another format, use the following parameters to specify the data format.
+ [FORMAT](copy-parameters-data-format.md#copy-format)
+ [CSV](copy-parameters-data-format.md#copy-csv)
+ [DELIMITER](copy-parameters-data-format.md#copy-delimiter) 
+ [FIXEDWIDTH](copy-parameters-data-format.md#copy-fixedwidth) 
+ [SHAPEFILE](copy-parameters-data-format.md#copy-shapefile) 
+ [AVRO](copy-parameters-data-format.md#copy-avro) 
+ [JSON format for COPY](copy-parameters-data-format.md#copy-json) 
+ [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) 
+ [BZIP2](copy-parameters-file-compression.md#copy-bzip2) 
+ [GZIP](copy-parameters-file-compression.md#copy-gzip) 
+ [LZOP](copy-parameters-file-compression.md#copy-lzop) 
+ [PARQUET](copy-parameters-data-format.md#copy-parquet) 
+ [ORC](copy-parameters-data-format.md#copy-orc) 
+ [ZSTD](copy-parameters-file-compression.md#copy-zstd) 

### Data conversion parameters
<a name="r_COPY-syntax-overview-data-conversion"></a>

As it loads the table, COPY attempts to implicitly convert the strings in the source data to the data type of the target column. If you need to specify a conversion that is different from the default behavior, or if the default conversion results in errors, you can manage data conversions by specifying the following parameters.
+ [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate) 
+ [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars) 
+ [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull) 
+ [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat) 
+ [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull) 
+ [ENCODING](copy-parameters-data-conversion.md#copy-encoding) 
+ [ESCAPE](copy-parameters-data-conversion.md#copy-escape) 
+ [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids) 
+ [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord) 
+ [IGNOREBLANKLINES](copy-parameters-data-conversion.md#copy-ignoreblanklines) 
+ [IGNOREHEADER](copy-parameters-data-conversion.md#copy-ignoreheader) 
+ [NULL AS](copy-parameters-data-conversion.md#copy-null-as) 
+ [REMOVEQUOTES](copy-parameters-data-conversion.md#copy-removequotes) 
+ [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) 
+ [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat) 
+ [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks) 
+ [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns) 
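
Conversion parameters are often combined in a single COPY command. The following sketch, with placeholder bucket and role names, applies a date format, loads blank fields as NULL, and truncates values that exceed the column width:

```
copy catdemo
from 's3://amzn-s3-demo-bucket/tickit/category_pipe.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
dateformat 'YYYY-MM-DD'
blanksasnull
truncatecolumns;
```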

### Data load operations
<a name="r_COPY-syntax-overview-data-load"></a>

Manage the default behavior of the load operation for troubleshooting or to reduce load times by specifying the following parameters. 
+ [COMPROWS](copy-parameters-data-load.md#copy-comprows) 
+ [COMPUPDATE](copy-parameters-data-load.md#copy-compupdate) 
+ [IGNOREALLERRORS](copy-parameters-data-load.md#copy-ignoreallerrors) 
+ [MAXERROR](copy-parameters-data-load.md#copy-maxerror) 
+ [NOLOAD](copy-parameters-data-load.md#copy-noload) 
+ [STATUPDATE](copy-parameters-data-load.md#copy-statupdate) 
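
For example, NOLOAD checks that the files parse cleanly without loading any rows, which is a quick way to validate a data set before a full load. A sketch with placeholder names:

```
copy catdemo
from 's3://amzn-s3-demo-bucket/tickit/category_pipe.txt'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
noload;
```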

## Usage notes and additional resources for the COPY command
<a name="r_COPY-using-the-copy-command"></a>

For more information about how to use the COPY command, see the following topics: 
+ [Usage notes](r_COPY_usage_notes.md)
+ [Tutorial: Loading data from Amazon S3](tutorial-loading-data.md)
+ [Amazon Redshift best practices for loading data](c_loading-data-best-practices.md)
+ [Loading tables with the COPY command](t_Loading_tables_with_the_COPY_command.md)
  + [Loading data from Amazon S3](t_Loading-data-from-S3.md)
  + [Loading data from Amazon EMR](loading-data-from-emr.md)
  + [Loading data from remote hosts](loading-data-from-remote-hosts.md) 
  + [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md)
+ [Troubleshooting data loads](t_Troubleshooting_load_errors.md)

## COPY command examples
<a name="r_COPY-using-the-copy-command-examples"></a>

For more examples that show how to COPY from various sources, in disparate formats, and with different COPY options, see [COPY examples](r_COPY_command_examples.md).

# COPY JOB
<a name="r_COPY-JOB"></a>

For information about using this command, see [Create an S3 event integration to automatically copy files from Amazon S3 buckets](loading-data-copy-job.md).

Manages COPY commands that load data into a table. The COPY JOB command is an extension of the COPY command and automates data loading from Amazon S3 buckets. When you create a COPY job, Amazon Redshift detects when new Amazon S3 files are created in a specified path, and then loads them automatically without your intervention. The same parameters that are used in the original COPY command are used when loading the data. Amazon Redshift keeps track of the loaded files (based on filename) to verify that they are loaded only one time.

**Note**  
For information about the COPY command, including usage, parameters, and permissions, see [COPY](r_COPY.md).

## Required permission
<a name="r_COPY-JOB-privileges"></a>

To use the COPY JOB command, you must have one of the following permissions in addition to all of the required permissions to use COPY:
+ Superuser
+  All of the following: 
  +  The relevant CREATE, ALTER, or DROP scoped permission for COPY JOBS in the database you want to COPY to. 
  +  USAGE permission for the schema you want to COPY to, or USAGE scoped permission for schemas in the database you want to COPY to. 
  +  INSERT permission for the table you want to COPY to, or INSERT scoped permission for tables in the schema or database you want to COPY to. 

The IAM role specified with the COPY command must have permission to access the data to load. For more information, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions).

## Syntax
<a name="r_COPY-JOB-syntax"></a>

Create a copy job. The parameters of the COPY command are saved with the copy job.

You can't run COPY JOB CREATE within the scope of a transaction block.

```
COPY copy-command JOB CREATE job-name
[AUTO ON | OFF]
```

Change the configuration of a copy job.

```
COPY JOB ALTER job-name
[AUTO ON | OFF]
```

Run a copy job. The stored COPY command parameters are used.

```
COPY JOB RUN job-name
```

List all copy jobs.

```
COPY JOB LIST
```

Show the details of a copy job.

```
COPY JOB SHOW job-name
```

Delete a copy job.

You can't run COPY JOB DROP within the scope of a transaction block.

```
COPY JOB DROP job-name
```

## Parameters
<a name="r_COPY-JOB-parameters"></a>

*copy-command*  
A COPY command that loads data from Amazon S3 to Amazon Redshift. The clause contains COPY parameters that define the Amazon S3 bucket, target table, IAM role, and other parameters used when loading data. All COPY command parameters for an Amazon S3 data load are supported, with the following exceptions:  
+ The COPY JOB does not ingest preexisting files in the folder pointed to by the COPY command. Only files created after the COPY JOB creation timestamp are ingested.
+ You cannot specify a COPY command with the MAXERROR or IGNOREALLERRORS options.
+ You cannot specify a manifest file. COPY JOB requires a designated Amazon S3 location to monitor for newly created files.
+ You cannot specify a COPY command with authorization types like Access and Secret keys. Only COPY commands that use the `IAM_ROLE` parameter for authorization are supported. For more information, see [Authorization parameters](copy-parameters-authorization.md).
+ The COPY JOB doesn't support the default IAM role associated with the cluster. You must specify the `IAM_ROLE` in the COPY command. 
For more information, see [COPY from Amazon S3](copy-parameters-data-source-s3.md).

*job-name*  
The name of the job used to reference the COPY job. The *job-name* can't contain a hyphen (-).

 [AUTO ON | OFF]   
Clause that indicates whether Amazon S3 data is automatically loaded into Amazon Redshift tables.  
+ When `ON`, Amazon Redshift monitors the source Amazon S3 path for newly created files, and if found, a COPY command is run with the COPY parameters in the job definition. This is the default.
+ When `OFF`, Amazon Redshift does not run the COPY JOB automatically.

## Usage notes
<a name="r_COPY-JOB-usage-notes"></a>

The options of the COPY command aren't validated until run time. For example, an invalid `IAM_ROLE` or an Amazon S3 data source results in runtime errors when the COPY JOB starts.

If the cluster is paused, COPY JOBS are not run.

To query COPY command files loaded and load errors, see [STL\_LOAD\_COMMITS](r_STL_LOAD_COMMITS.md), [STL\_LOAD\_ERRORS](r_STL_LOAD_ERRORS.md), and [STL\_LOADERROR\_DETAIL](r_STL_LOADERROR_DETAIL.md). For more information, see [Verifying that the data loaded correctly](verifying-that-data-loaded-correctly.md).

COPY jobs aren't supported on zero-ETL databases, because those databases operate in read-only mode.

## Examples
<a name="r_COPY-JOB-examples"></a>

The following example shows creating a COPY JOB to load data from an Amazon S3 bucket. 

```
COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName' 
JOB CREATE my_copy_job_name
AUTO ON;
```
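
After the job exists, the other COPY JOB forms shown in the syntax manage it by name:

```
COPY JOB SHOW my_copy_job_name;
COPY JOB ALTER my_copy_job_name AUTO OFF;
COPY JOB RUN my_copy_job_name;
COPY JOB DROP my_copy_job_name;
```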

# COPY with TEMPLATE
<a name="r_COPY-WITH-TEMPLATE"></a>

You can use Redshift templates with COPY commands to simplify command syntax and ensure consistency across data loading operations. Instead of specifying the same formatting parameters repeatedly, you define them once in a template and reference the template in your COPY commands. When you use a template, the COPY command combines the parameters from the template with any parameters specified directly in the command. If the same parameter appears in both the template and the command, the command parameter takes precedence. For more information, see [CREATE TEMPLATE](r_CREATE_TEMPLATE.md). 

Templates for the COPY command can be created with:
+ [Data format parameters](copy-parameters-data-format.md)
+ [File compression parameters](copy-parameters-file-compression.md)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

For a complete list of supported parameters, see the [COPY](r_COPY.md) command.

## Required permission
<a name="r_COPY-WITH-TEMPLATE-privileges"></a>

To use a template in a COPY command, you must have:
+ All required permissions to run the COPY command (see [Required permissions](r_COPY.md#r_COPY-permissions))
+ One of the following template permissions:
  + Superuser privileges
  + USAGE privilege on the template and USAGE privilege on the schema containing the template

## Syntax
<a name="r_COPY-WITH-TEMPLATE-syntax"></a>

```
COPY target_table FROM 's3://...'
authorization
[ option, ...]
USING TEMPLATE [database_name.][schema_name.]template_name;
```

## Parameters
<a name="r_COPY-WITH-TEMPLATE-parameters"></a>

 *database\_name*   
(Optional) The name of the database where the template exists. If not specified, the current database is used.

 *schema\_name*   
(Optional) The name of the schema where the template exists. If not specified, the template is searched for in the current search path.

 *template\_name*   
The name of the template to use in COPY. 

## Usage notes
<a name="r_COPY-WITH_TEMPLATE-usage-notes"></a>
+ Command-specific parameters (source, destination, authorization) must still be specified in the COPY command.
+ Templates cannot contain manifest file specifications for COPY commands.

## Examples
<a name="r_COPY-WITH-TEMPLATE-examples"></a>

The following examples show how to create a template and use it in COPY commands:

```
CREATE TEMPLATE public.test_template FOR COPY AS
CSV DELIMITER '|' IGNOREHEADER 1 MAXERROR 100;

COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName'
USING TEMPLATE public.test_template;
```

When a parameter exists in both the template and the command, the command parameter takes precedence. In this example, if the template `public.test_template` contains `DELIMITER '|'` but the COPY command specifies `DELIMITER ','`, the comma delimiter (`,`) from the command will be used instead of the pipe delimiter (`|`) from the template. 

```
COPY public.target_table
FROM 's3://amzn-s3-demo-bucket/staging-folder'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyLoadRoleName'
DELIMITER ','
USING TEMPLATE public.test_template;
```
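The precedence rule can be sketched as a simple option merge, where command-level options override template-level options. The following is an illustrative sketch only, not Amazon Redshift's implementation; the option names are examples.

```python
# Illustrative sketch of COPY template option precedence: options given
# on the COPY command override options stored in the template.

def resolve_copy_options(template_options: dict, command_options: dict) -> dict:
    """Merge template and command options; the command wins on conflicts."""
    merged = dict(template_options)   # start from the template's settings
    merged.update(command_options)    # command-level options take precedence
    return merged

template = {"FORMAT": "CSV", "DELIMITER": "|", "IGNOREHEADER": 1, "MAXERROR": 100}
command = {"DELIMITER": ","}          # the COPY command specifies its own delimiter

effective = resolve_copy_options(template, command)
print(effective["DELIMITER"])         # the comma from the command wins
```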

# COPY parameter reference
<a name="r_COPY-parameters"></a>

COPY has many parameters that can be used in many situations. However, not all parameters are supported in each situation. For example, only a limited set of parameters is supported when loading from ORC or Parquet files. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md).

**Topics**
+ [Data sources](copy-parameters-data-source.md)
+ [Authorization parameters](copy-parameters-authorization.md)
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md)
+ [File compression parameters](copy-parameters-file-compression.md)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)
+ [Alphabetical parameter list](r_COPY-alphabetical-parm-list.md)

# Data sources
<a name="copy-parameters-data-source"></a>

You can load data from text files in an Amazon S3 bucket, in an Amazon EMR cluster, or on a remote host that your cluster can access using an SSH connection. You can also load data directly from a DynamoDB table. 

The maximum size of a single input row from any source is 4 MB. 

To export data from a table to a set of files in Amazon S3, use the [UNLOAD](r_UNLOAD.md) command. 
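Because COPY rejects rows larger than 4 MB, it can be useful to scan source files for oversized rows before loading. The following is an illustrative local pre-check, not part of COPY itself.

```python
# Illustrative pre-check for the 4 MB maximum input-row size.
# Scans delimited text lines and reports any that exceed the limit.

MAX_ROW_BYTES = 4 * 1024 * 1024  # 4 MB, the COPY per-row limit

def oversized_rows(lines):
    """Yield (line_number, byte_length) for rows over the limit."""
    for lineno, line in enumerate(lines, start=1):
        size = len(line.encode("utf-8"))
        if size > MAX_ROW_BYTES:
            yield lineno, size

rows = ["a|b|c\n", "x" * (MAX_ROW_BYTES + 1) + "\n"]
print(list(oversized_rows(rows)))  # [(2, 4194306)]
```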

**Topics**
+ [COPY from Amazon S3](copy-parameters-data-source-s3.md)
+ [COPY from Amazon EMR](copy-parameters-data-source-emr.md)
+ [COPY from remote host (SSH)](copy-parameters-data-source-ssh.md)
+ [COPY from Amazon DynamoDB](copy-parameters-data-source-dynamodb.md)

# COPY from Amazon S3
<a name="copy-parameters-data-source-s3"></a>

To load data from files located in one or more S3 buckets, use the FROM clause to indicate how COPY locates the files in Amazon S3. You can provide the object path to the data files as part of the FROM clause, or you can provide the location of a manifest file that contains a list of Amazon S3 object paths. COPY from Amazon S3 uses an HTTPS connection. Ensure that the S3 IP ranges are added to your allow list. To learn more about the required S3 IP ranges, see [ Network isolation](https://docs.aws.amazon.com//redshift/latest/mgmt/security-network-isolation.html#network-isolation).

**Important**  
If the Amazon S3 buckets that hold the data files don't reside in the same AWS Region as your cluster, you must use the [REGION](#copy-region) parameter to specify the Region in which the data is located. 

**Topics**
+ [Syntax](#copy-parameters-data-source-s3-syntax)
+ [Examples](#copy-parameters-data-source-s3-examples)
+ [Optional parameters](#copy-parameters-data-source-s3-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-s3-unsupported-parms)

## Syntax
<a name="copy-parameters-data-source-s3-syntax"></a>

```
FROM { 's3://objectpath' | 's3://manifest_file' }
authorization
| MANIFEST
| ENCRYPTED
| REGION [AS] 'aws-region'
| optional-parameters
```

## Examples
<a name="copy-parameters-data-source-s3-examples"></a>

The following example uses an object path to load data from Amazon S3. 

```
copy customer
from 's3://amzn-s3-demo-bucket/customer' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The following example uses a manifest file to load data from Amazon S3. 

```
copy customer
from 's3://amzn-s3-demo-bucket/cust.manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest;
```

### Parameters
<a name="copy-parameters-data-source-s3-parameters"></a>

FROM  <a name="copy-parameters-from"></a>
The source of the data to be loaded. For more information about the encoding of the Amazon S3 file, see [Data conversion parameters](copy-parameters-data-conversion.md).

's3://*copy\_from\_s3\_objectpath*'   <a name="copy-s3-objectpath"></a>
Specifies the path to the Amazon S3 objects that contain the data—for example, `'s3://amzn-s3-demo-bucket/custdata.txt'`. The *s3://copy\_from\_s3\_objectpath* parameter can reference a single file or a set of objects or folders that have the same key prefix. For example, the name `custdata.txt` is a key prefix that refers to a number of physical files: `custdata.txt`, `custdata.txt.1`, `custdata.txt.2`, `custdata.txt.bak`, and so on. The key prefix can also reference a number of folders. For example, `'s3://amzn-s3-demo-bucket/custfolder'` refers to the folders `custfolder`, `custfolder_1`, `custfolder_2`, and so on. If a key prefix references multiple folders, all of the files in the folders are loaded. If a key prefix matches a file as well as a folder, such as `custfolder.log`, COPY also attempts to load the file. If a key prefix might result in COPY attempting to load unwanted files, use a manifest file. For more information, see [copy_from_s3_manifest_file](#copy-manifest-file), following.   
If the S3 bucket that holds the data files doesn't reside in the same AWS Region as your cluster, you must use the [REGION](#copy-region) parameter to specify the Region in which the data is located.
For more information, see [Loading data from Amazon S3](t_Loading-data-from-S3.md).

's3://*copy\_from\_s3\_manifest\_file*'   <a name="copy-manifest-file"></a>
Specifies the Amazon S3 object key for a manifest file that lists the data files to be loaded. The *'s3://copy\_from\_s3\_manifest\_file'* argument must explicitly reference a single file—for example, `'s3://amzn-s3-demo-bucket/manifest.txt'`. It can't reference a key prefix.  
The manifest is a text file in JSON format that lists the URL of each file that is to be loaded from Amazon S3. The URL includes the bucket name and full object path for the file. The files that are specified in the manifest can be in different buckets, but all the buckets must be in the same AWS Region as the Amazon Redshift cluster. If a file is listed twice, the file is loaded twice. The following example shows the JSON for a manifest that loads three files.   

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata.1","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket1/custdata.2","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket2/custdata.1","mandatory":false}
  ]
}
```
The double quotation mark characters are required, and must be simple quotation marks (0x22), not slanted or "smart" quotation marks. Each entry in the manifest can optionally include a `mandatory` flag. If `mandatory` is set to `true`, COPY terminates if it doesn't find the file for that entry; otherwise, COPY will continue. The default value for `mandatory` is `false`.   
When loading from data files in ORC or Parquet format, a `meta` field is required, as shown in the following example.  

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket1/orc/2013-10-04-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      },
      {  
         "url":"s3://amzn-s3-demo-bucket2/orc/2013-10-05-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      }
   ]
}
```
The manifest file must not be encrypted or compressed, even if the ENCRYPTED, GZIP, LZOP, BZIP2, or ZSTD options are specified. COPY returns an error if the specified manifest file isn't found or the manifest file isn't properly formed.   
If a manifest file is used, the MANIFEST parameter must be specified with the COPY command. If the MANIFEST parameter isn't specified, COPY assumes that the file specified with FROM is a data file.   
For more information, see [Loading data from Amazon S3](t_Loading-data-from-S3.md).
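Because the manifest format is plain JSON, a manifest file can be generated programmatically. The following sketch builds one, including the optional `meta.content_length` field that ORC and Parquet sources require; the bucket and key names are placeholders.

```python
import json

# Illustrative generator for a COPY manifest file. Each entry lists a full
# S3 URL; "mandatory" controls whether COPY fails when the file is missing.
# For ORC/Parquet sources, "meta" with "content_length" (bytes) is required.

def build_manifest(files, include_meta=False):
    """files: iterable of (url, mandatory, size_in_bytes) tuples."""
    entries = []
    for url, mandatory, size in files:
        entry = {"url": url, "mandatory": mandatory}
        if include_meta:
            entry["meta"] = {"content_length": size}
        entries.append(entry)
    return json.dumps({"entries": entries}, indent=2)

manifest = build_manifest(
    [("s3://amzn-s3-demo-bucket1/custdata.1", True, 99),
     ("s3://amzn-s3-demo-bucket2/custdata.1", False, 99)]
)
print(manifest)
```

Write the result to a text file, upload it to Amazon S3 unencrypted and uncompressed, and reference it with the MANIFEST parameter.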

*authorization*  
The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an AWS Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

MANIFEST  <a name="copy-manifest"></a>
Specifies that a manifest is used to identify the data files to be loaded from Amazon S3. If the MANIFEST parameter is used, COPY loads data from the files listed in the manifest referenced by *'s3://copy\_from\_s3\_manifest\_file'*. If the manifest file isn't found, or isn't properly formed, COPY fails. For more information, see [Using a manifest to specify data files](loading-data-files-using-manifest.md).

ENCRYPTED  <a name="copy-encrypted"></a>
A clause that specifies that the input files on Amazon S3 are encrypted using client-side encryption with customer managed keys. For more information, see [Loading encrypted data files from Amazon S3](c_loading-encrypted-files.md). Don't specify ENCRYPTED if the input files are encrypted using Amazon S3 server-side encryption (SSE-KMS or SSE-S3). COPY reads server-side encrypted files automatically.  
If you specify the ENCRYPTED parameter, you must also specify the [MASTER_SYMMETRIC_KEY](#copy-master-symmetric-key) parameter or include the **master\_symmetric\_key** value in the [Using the CREDENTIALS parameter](copy-parameters-authorization.md#copy-credentials) string.  
If the encrypted files are in compressed format, add the GZIP, LZOP, BZIP2, or ZSTD parameter.  
Manifest files and JSONPaths files must not be encrypted, even if the ENCRYPTED option is specified.

MASTER\_SYMMETRIC\_KEY '*root\_key*'  <a name="copy-master-symmetric-key"></a>
The root symmetric key that was used to encrypt data files on Amazon S3. If MASTER\_SYMMETRIC\_KEY is specified, the [ENCRYPTED](#copy-encrypted) parameter must also be specified. MASTER\_SYMMETRIC\_KEY can't be used with the CREDENTIALS parameter. For more information, see [Loading encrypted data files from Amazon S3](c_loading-encrypted-files.md).  
If the encrypted files are in compressed format, add the GZIP, LZOP, BZIP2, or ZSTD parameter.

REGION [AS] '*aws-region*'  <a name="copy-region"></a>
Specifies the AWS Region where the source data is located. REGION is required for COPY from an Amazon S3 bucket or a DynamoDB table when the AWS resource that contains the data isn't in the same Region as the Amazon Redshift cluster.   
The value for *aws\_region* must match a Region listed in the [Amazon Redshift regions and endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html#redshift_region) table.  
If the REGION parameter is specified, all resources, including a manifest file or multiple Amazon S3 buckets, must be located in the specified Region.   
Transferring data across Regions incurs additional charges against the Amazon S3 bucket or the DynamoDB table that contains the data. For more information about pricing, see **Data Transfer OUT From Amazon S3 To Another AWS Region** on the [Amazon S3 Pricing](https://aws.amazon.com/s3/pricing/) page and **Data Transfer OUT** on the [Amazon DynamoDB Pricing](https://aws.amazon.com/dynamodb/pricing/) page. 
By default, COPY assumes that the data is located in the same Region as the Amazon Redshift cluster. 

## Optional parameters
<a name="copy-parameters-data-source-s3-optional-parms"></a>

You can optionally specify the following parameters with COPY from Amazon S3: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters
<a name="copy-parameters-data-source-s3-unsupported-parms"></a>

You can't use the following parameters with COPY from Amazon S3: 
+ SSH
+ READRATIO

# COPY from Amazon EMR
<a name="copy-parameters-data-source-emr"></a>

You can use the COPY command to load data in parallel from an Amazon EMR cluster configured to write text files to the cluster's Hadoop Distributed File System (HDFS) in the form of fixed-width files, character-delimited files, CSV files, JSON-formatted files, or Avro files.

**Topics**
+ [Syntax](#copy-parameters-data-source-emr-syntax)
+ [Example](#copy-parameters-data-source-emr-example)
+ [Parameters](#copy-parameters-data-source-emr-parameters)
+ [Supported parameters](#copy-parameters-data-source-emr-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-emr-unsupported-parms)

## Syntax
<a name="copy-parameters-data-source-emr-syntax"></a>

```
FROM 'emr://emr_cluster_id/hdfs_filepath'  
authorization
[ optional_parameters ]
```

## Example
<a name="copy-parameters-data-source-emr-example"></a>

The following example loads data from an Amazon EMR cluster. 

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/part-*' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Parameters
<a name="copy-parameters-data-source-emr-parameters"></a>

FROM  
The source of the data to be loaded. 

 'emr://*emr\_cluster\_id*/*hdfs\_file\_path*'  <a name="copy-emr"></a>
The unique identifier for the Amazon EMR cluster and the HDFS file path that references the data files for the COPY command. The HDFS data file names must not contain the wildcard characters asterisk (\*) and question mark (?).   
The Amazon EMR cluster must continue running until the COPY operation completes. If any of the HDFS data files are changed or deleted before the COPY operation completes, you might have unexpected results, or the COPY operation might fail. 
You can use the wildcard characters asterisk (\*) and question mark (?) as part of the *hdfs\_file\_path* argument to specify multiple files to be loaded. For example, `'emr://j-SAMPLE2B500FC/myoutput/part*'` identifies the files `part-0000`, `part-0001`, and so on. If the file path doesn't contain wildcard characters, it is treated as a string literal. If you specify only a folder name, COPY attempts to load all files in the folder.   
If you use wildcard characters or use only the folder name, verify that no unwanted files will be loaded. For example, some processes might write a log file to the output folder.
For more information, see [Loading data from Amazon EMR](loading-data-from-emr.md).
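Before running the COPY, you can preview which HDFS file names a wildcard would select by applying shell-style pattern matching locally. This is an approximation of the wildcard behavior described above, sketched with Python's `fnmatch` module; the file names are placeholders.

```python
import fnmatch

# Illustrative preview of which file names an HDFS wildcard path would
# match. Uses shell-style matching (* and ?) to approximate the
# 'emr://...' wildcard behavior described above.

files = ["part-0000", "part-0001", "part-0002", "_logs/job.log"]
matched = [f for f in files if fnmatch.fnmatch(f, "part*")]
print(matched)  # ['part-0000', 'part-0001', 'part-0002']
```

A dry run like this helps confirm that no unwanted files, such as log files written to the output folder, would be picked up.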

*authorization*  
The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an AWS Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

## Supported parameters
<a name="copy-parameters-data-source-emr-optional-parms"></a>

You can optionally specify the following parameters with COPY from Amazon EMR: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters
<a name="copy-parameters-data-source-emr-unsupported-parms"></a>

You can't use the following parameters with COPY from Amazon EMR: 
+ ENCRYPTED
+ MANIFEST
+ REGION
+ READRATIO
+ SSH

# COPY from remote host (SSH)
<a name="copy-parameters-data-source-ssh"></a>

You can use the COPY command to load data in parallel from one or more remote hosts, such as Amazon Elastic Compute Cloud (Amazon EC2) instances or other computers. COPY connects to the remote hosts using Secure Shell (SSH) and runs commands on the remote hosts to generate text output. The remote host can be an EC2 Linux instance or another Unix or Linux computer configured to accept SSH connections. Amazon Redshift can connect to multiple hosts, and can open multiple SSH connections to each host. Amazon Redshift sends a unique command through each connection to generate text output to the host's standard output, which Amazon Redshift then reads as it does a text file.

Use the FROM clause to specify the Amazon S3 object key for the manifest file that provides the information COPY uses to open SSH connections and run the remote commands. 

**Topics**
+ [Syntax](#copy-parameters-data-source-ssh-syntax)
+ [Examples](#copy-parameters-data-source-ssh-examples)
+ [Parameters](#copy-parameters-data-source-ssh-parameters)
+ [Optional parameters](#copy-parameters-data-source-ssh-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-ssh-unsupported-parms)

**Important**  
 If the S3 bucket that holds the manifest file doesn't reside in the same AWS Region as your cluster, you must use the REGION parameter to specify the Region in which the bucket is located. 

## Syntax
<a name="copy-parameters-data-source-ssh-syntax"></a>

```
FROM 's3://ssh_manifest_file'
authorization
SSH
| optional-parameters
```

## Examples
<a name="copy-parameters-data-source-ssh-examples"></a>

The following example uses a manifest file to load data from a remote host using SSH. 

```
copy sales
from 's3://amzn-s3-demo-bucket/ssh_manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
ssh;
```

## Parameters
<a name="copy-parameters-data-source-ssh-parameters"></a>

FROM  
The source of the data to be loaded. 

's3://*copy\_from\_ssh\_manifest\_file*'  <a name="copy-ssh-manifest"></a>
The COPY command can connect to multiple hosts using SSH, and can create multiple SSH connections to each host. COPY runs a command through each host connection, and then loads the output from the commands in parallel into the table. The *s3://copy\_from\_ssh\_manifest\_file* argument specifies the Amazon S3 object key for the manifest file that provides the information COPY uses to open SSH connections and run the remote commands.  
The *s3://copy\_from\_ssh\_manifest\_file* argument must explicitly reference a single file; it can't be a key prefix. The following shows an example:  

```
's3://amzn-s3-demo-bucket/ssh_manifest.txt'
```
The manifest file is a text file in JSON format that Amazon Redshift uses to connect to the host. The manifest file specifies the SSH host endpoints and the commands that will be run on the hosts to return data to Amazon Redshift. Optionally, you can include the host public key, the login user name, and a mandatory flag for each entry. The following example shows a manifest file that creates two SSH connections:   

```
{ 
    "entries": [ 
	    {"endpoint":"<ssh_endpoint_or_IP>", 
           "command": "<remote_command>",
           "mandatory":true, 
           "publickey": "<public_key>", 
           "username": "<host_user_name>"}, 
	    {"endpoint":"<ssh_endpoint_or_IP>", 
           "command": "<remote_command>",
           "mandatory":true, 
           "publickey": "<public_key>", 
           "username": "<host_user_name>"} 
     ] 
}
```
The manifest file contains one `"entries"` construct for each SSH connection. You can have multiple connections to a single host or multiple connections to multiple hosts. The double quotation mark characters are required as shown, both for the field names and the values. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks. The only value that doesn't need double quotation mark characters is the Boolean value `true` or `false` for the `"mandatory"` field.   
The following list describes the fields in the manifest file.     
endpoint  <a name="copy-ssh-manifest-endpoint"></a>
The URL address or IP address of the host—for example, `"ec2-111-222-333.compute-1.amazonaws.com"`, or `"198.51.100.0"`.   
command  <a name="copy-ssh-manifest-command"></a>
The command to be run by the host to generate text output or binary output in gzip, lzop, bzip2, or zstd format. The command can be any command that the user *"host\_user\_name"* has permission to run. The command can be as simple as printing a file, or it can query a database or launch a script. The output (a text file, or a gzip, lzop, bzip2, or zstd binary file) must be in a form that the Amazon Redshift COPY command can ingest. For more information, see [Preparing your input data](t_preparing-input-data.md).  
publickey  <a name="copy-ssh-manifest-publickey"></a>
(Optional) The public key of the host. If provided, Amazon Redshift will use the public key to identify the host. If the public key isn't provided, Amazon Redshift will not attempt host identification. For example, if the remote host's public key is `ssh-rsa AbcCbaxxx…Example root@amazon.com`, type the following text in the public key field: `"AbcCbaxxx…Example"`  
mandatory  <a name="copy-ssh-manifest-mandatory"></a>
(Optional) A clause that indicates whether the COPY command should fail if the connection attempt fails. The default is `false`. If Amazon Redshift doesn't successfully make at least one connection, the COPY command fails.  
username  <a name="copy-ssh-manifest-username"></a>
(Optional) The user name that will be used to log on to the host system and run the remote command. The user login name must be the same as the login that was used to add the Amazon Redshift cluster's public key to the host's authorized keys file. The default username is `redshift`.
For more information about creating a manifest file, see [Loading data process](loading-data-from-remote-hosts.md#load-from-host-process).  
To COPY from a remote host, the SSH parameter must be specified with the COPY command. If the SSH parameter isn't specified, COPY assumes that the file specified with FROM is a data file and will fail.   
If you use automatic compression, the COPY command performs two data read operations, which means it will run the remote command twice. The first read operation is to provide a data sample for compression analysis, then the second read operation actually loads the data. If executing the remote command twice might cause a problem, you should disable automatic compression. To disable automatic compression, run the COPY command with the COMPUPDATE parameter set to OFF. For more information, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).  
For detailed procedures for using COPY from SSH, see [Loading data from remote hosts](loading-data-from-remote-hosts.md).
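Because the SSH manifest is also plain JSON, it can be generated programmatically. The following sketch follows the field names in the manifest description above; the endpoint, command, and user name values are placeholders.

```python
import json

# Illustrative generator for a COPY-over-SSH manifest. One entry per SSH
# connection; "publickey" and "username" are optional, and "mandatory"
# controls whether a failed connection fails the COPY (default false).

def build_ssh_manifest(connections):
    """connections: iterable of dicts with endpoint/command and optional keys."""
    entries = []
    for conn in connections:
        entry = {"endpoint": conn["endpoint"],
                 "command": conn["command"],
                 "mandatory": conn.get("mandatory", False)}
        if "publickey" in conn:
            entry["publickey"] = conn["publickey"]
        if "username" in conn:
            entry["username"] = conn["username"]
        entries.append(entry)
    return json.dumps({"entries": entries}, indent=2)

manifest = build_ssh_manifest([
    {"endpoint": "ec2-192-0-2-0.compute-1.amazonaws.com",
     "command": "cat /data/sales.txt",
     "mandatory": True,
     "username": "redshift"},
])
print(manifest)
```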

*authorization*  
The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an AWS Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

SSH  <a name="copy-ssh"></a>
A clause that specifies that data is to be loaded from a remote host using the SSH protocol. If you specify SSH, you must also provide a manifest file using the [s3://copy_from_ssh_manifest_file](#copy-ssh-manifest) argument.   
If you are using SSH to copy from a host using a private IP address in a remote VPC, the VPC must have enhanced VPC routing enabled. For more information about Enhanced VPC routing, see [Amazon Redshift Enhanced VPC Routing](https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html).

## Optional parameters
<a name="copy-parameters-data-source-ssh-optional-parms"></a>

You can optionally specify the following parameters with COPY from SSH: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ [Data format parameters](copy-parameters-data-format.md#copy-data-format-parameters)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters
<a name="copy-parameters-data-source-ssh-unsupported-parms"></a>

You can't use the following parameters with COPY from SSH: 
+ ENCRYPTED
+ MANIFEST
+ READRATIO

# COPY from Amazon DynamoDB
<a name="copy-parameters-data-source-dynamodb"></a>

To load data from an existing DynamoDB table, use the FROM clause to specify the DynamoDB table name.

**Topics**
+ [Syntax](#copy-parameters-data-source-dynamodb-syntax)
+ [Examples](#copy-parameters-data-source-dynamodb-examples)
+ [Optional parameters](#copy-parameters-data-source-dynamodb-optional-parms)
+ [Unsupported parameters](#copy-parameters-data-source-dynamodb-unsupported-parms)

**Important**  
If the DynamoDB table doesn't reside in the same region as your Amazon Redshift cluster, you must use the REGION parameter to specify the region in which the data is located. 

## Syntax
<a name="copy-parameters-data-source-dynamodb-syntax"></a>

```
FROM 'dynamodb://table-name' 
authorization
READRATIO ratio
| REGION [AS] 'aws_region'  
| optional-parameters
```

## Examples
<a name="copy-parameters-data-source-dynamodb-examples"></a>

The following example loads data from a DynamoDB table. 

```
copy favoritemovies from 'dynamodb://ProductCatalog'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
readratio 50;
```

### Parameters
<a name="copy-parameters-data-source-dynamodb-parameters"></a>

FROM  
The source of the data to be loaded. 

'dynamodb://*table-name*'  <a name="copy-dynamodb"></a>
The name of the DynamoDB table that contains the data, for example `'dynamodb://ProductCatalog'`. For details about how DynamoDB attributes are mapped to Amazon Redshift columns, see [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md).  
A DynamoDB table name is unique to an AWS account, which is identified by the AWS access credentials.

*authorization*  
The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, DynamoDB, and Amazon EC2. You can provide that authorization by referencing an AWS Identity and Access Management (IAM) role that is attached to your cluster (role-based access control) or by providing the access credentials for a user (key-based access control). For increased security and flexibility, we recommend using IAM role-based access control. For more information, see [Authorization parameters](copy-parameters-authorization.md).

READRATIO [AS] *ratio*  <a name="copy-readratio"></a>
The percentage of the DynamoDB table's provisioned throughput to use for the data load. READRATIO is required for COPY from DynamoDB. It can't be used with COPY from Amazon S3. We highly recommend setting the ratio to a value less than the average unused provisioned throughput. Valid values are integers 1–200.  
Setting READRATIO to 100 or higher enables Amazon Redshift to consume the entirety of the DynamoDB table's provisioned throughput, which seriously degrades the performance of concurrent read operations against the same table during the COPY session. Write traffic is unaffected. Values higher than 100 are allowed to troubleshoot rare scenarios when Amazon Redshift fails to fulfill the provisioned throughput of the table. If you load data from DynamoDB to Amazon Redshift on an ongoing basis, consider organizing your DynamoDB tables as a time series to separate live traffic from the COPY operation.
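As a rough illustration of the guidance above, a starting READRATIO can be derived from the table's provisioned read capacity and its average consumed capacity. This arithmetic is a hypothetical sizing sketch, not an AWS formula; tune the headroom factor for your workload.

```python
# Illustrative READRATIO sizing: aim below the table's average *unused*
# provisioned read throughput, clamped to COPY's valid range of 1-200.

def suggest_readratio(provisioned_rcu, avg_consumed_rcu, headroom=0.8):
    """Suggest a READRATIO using a fraction of the unused throughput."""
    unused_fraction = max(provisioned_rcu - avg_consumed_rcu, 0) / provisioned_rcu
    ratio = int(unused_fraction * 100 * headroom)
    return min(max(ratio, 1), 200)   # COPY accepts integers 1-200

# Example: 1,000 RCUs provisioned, ~300 consumed on average -> 70% unused,
# so target a READRATIO comfortably below 70.
print(suggest_readratio(1000, 300))  # 56
```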

## Optional parameters
<a name="copy-parameters-data-source-dynamodb-optional-parms"></a>

You can optionally specify the following parameters with COPY from Amazon DynamoDB: 
+ [Column mapping options](copy-parameters-column-mapping.md)
+ The following data conversion parameters are supported:
  + [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate) 
  + [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull) 
  + [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat) 
  + [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull) 
  + [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) 
  + [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat) 
  + [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks) 
  + [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns) 
+ [Data load operations](copy-parameters-data-load.md)

## Unsupported parameters
<a name="copy-parameters-data-source-dynamodb-unsupported-parms"></a>

You can't use the following parameters with COPY from DynamoDB: 
+ All data format parameters
+ ESCAPE
+ FILLRECORD
+ IGNOREBLANKLINES
+ IGNOREHEADER
+ NULL
+ REMOVEQUOTES
+ ACCEPTINVCHARS
+ MANIFEST
+ ENCRYPTED

# Authorization parameters
<a name="copy-parameters-authorization"></a>

The COPY command needs authorization to access data in another AWS resource, including in Amazon S3, Amazon EMR, Amazon DynamoDB, and Amazon EC2. You can provide that authorization by referencing an [AWS Identity and Access Management (IAM) role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) that is attached to your cluster (*role-based access control*). You can encrypt your load data on Amazon S3. 

The following topics provide more details and examples of authentication options:
+ [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions)
+ [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based)
+ [Key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based)

Use one of the following to provide authorization for the COPY command: 
+ [Using the IAM\_ROLE parameter](#copy-iam-role)
+ [Using the ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY parameters](#copy-access-key-id)
+ [Using the CREDENTIALS parameter](#copy-credentials)

## Using the IAM\_ROLE parameter
<a name="copy-iam-role"></a>

### IAM\_ROLE
<a name="copy-iam-role-iam"></a>

Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the COPY command runs. 

Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. If you specify IAM\_ROLE, you can't use ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY, SESSION\_TOKEN, or CREDENTIALS.

The following shows the syntax for the IAM\_ROLE parameter. 

```
IAM_ROLE { default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' }
```

For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 

## Using the ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY parameters
<a name="copy-access-key-id"></a>

### ACCESS\_KEY\_ID, SECRET\_ACCESS\_KEY
<a name="copy-access-key-id-access"></a>

This authorization method is not recommended. 

**Note**  
Instead of providing access credentials as plain text, we strongly recommend using role-based authentication by specifying the IAM\_ROLE parameter. For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 

### SESSION\_TOKEN
<a name="copy-token"></a>

The session token for use with temporary access credentials. When SESSION\_TOKEN is specified, you must also use ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY to provide temporary access key credentials. If you specify SESSION\_TOKEN, you can't use IAM\_ROLE or CREDENTIALS. For more information, see [Temporary security credentials](copy-usage_notes-access-permissions.md#r_copy-temporary-security-credentials) in the IAM User Guide.

**Note**  
Instead of creating temporary security credentials, we strongly recommend using role-based authentication. When you authorize using an IAM role, Amazon Redshift automatically creates temporary user credentials for each session. For more information, see [Role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based). 

The following shows the syntax for the SESSION_TOKEN parameter with the ACCESS_KEY_ID and SECRET_ACCESS_KEY parameters. 

```
ACCESS_KEY_ID '<access-key-id>'
SECRET_ACCESS_KEY '<secret-access-key>'
SESSION_TOKEN '<temporary-token>';
```

If you specify SESSION_TOKEN, you can't use CREDENTIALS or IAM_ROLE. 

## Using the CREDENTIALS parameter
<a name="copy-credentials"></a>

### CREDENTIALS
<a name="copy-credentials-cred"></a>

A clause that indicates the method your cluster uses when accessing other AWS resources that contain data files or manifest files. You can't use the CREDENTIALS parameter with IAM_ROLE or with ACCESS_KEY_ID and SECRET_ACCESS_KEY.

The following shows the syntax for the CREDENTIALS parameter.

```
[WITH] CREDENTIALS [AS] 'credentials-args'
```

**Note**  
For increased flexibility, we recommend using the [IAM_ROLE](#copy-iam-role-iam) parameter instead of the CREDENTIALS parameter.

Optionally, if the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) parameter is used, the *credentials-args* string also provides the encryption key.

The *credentials-args* string is case-sensitive and must not contain spaces.

The keywords WITH and AS are optional and are ignored.

You can specify either [role-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-role-based.phrase) or [key-based access control](copy-usage_notes-access-permissions.md#copy-usage_notes-access-key-based.phrase). In either case, the IAM role or user must have the permissions required to access the specified AWS resources. For more information, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](copy-usage_notes-access-permissions.md#copy-usage_notes-iam-permissions). 

**Note**  
To safeguard your AWS credentials and protect sensitive data, we strongly recommend using role-based access control. 

To specify role-based access control, provide the *credentials-args* string in the following format.

```
'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>'
```

To use temporary token credentials, you must provide the temporary access key ID, the temporary secret access key, and the temporary token. The *credentials-args* string is in the following format. 

```
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;token=<temporary-token>'
```
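Outside of Redshift, the *credentials-args* string can be assembled programmatically. The sketch below uses the key names shown above; the helper function itself is a hypothetical illustration, not part of any AWS library:

```python
def build_temp_credentials_args(access_key_id, secret_access_key, token):
    # The three key names are the ones COPY expects for temporary
    # credentials, joined by semicolons with no spaces.
    return (
        f"aws_access_key_id={access_key_id};"
        f"aws_secret_access_key={secret_access_key};"
        f"token={token}"
    )

args = build_temp_credentials_args("temp-key-id", "temp-secret", "temp-token")
```

Because the string is case-sensitive and must not contain spaces, building it in one place helps avoid subtle quoting mistakes.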

A COPY command that uses temporary credentials would resemble the following sample statement: 

```
COPY customer FROM 's3://amzn-s3-demo-bucket/mydata' 
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key-id>;token=<temporary-token>'
```

 For more information, see [Temporary security credentials](copy-usage_notes-access-permissions.md#r_copy-temporary-security-credentials).

If the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) parameter is used, the *credentials-args* string is in the following format, where *<root-key>* is the value of the root key that was used to encrypt the files.

```
CREDENTIALS
'<credentials-args>;master_symmetric_key=<root-key>'
```

A COPY command using role-based access control with an encryption key would resemble the following sample statement:

```
COPY customer FROM 's3://amzn-s3-demo-bucket/mydata' 
CREDENTIALS 
'aws_iam_role=arn:aws:iam::<account-id>:role/<role-name>;master_symmetric_key=<root-key>'
```

# Column mapping options
<a name="copy-parameters-column-mapping"></a>

By default, COPY inserts values into the target table's columns in the same order as fields occur in the data files. If the default column order will not work, you can specify a column list or use JSONPath expressions to map source data fields to the target columns. 
+ [Column List](#copy-column-list)
+ [JSONPaths File](#copy-column-mapping-jsonpaths)

## Column list
<a name="copy-column-list"></a>

You can specify a comma-separated list of column names to load source data fields into specific target columns. The columns can be in any order in the COPY statement, but when loading from flat files, such as in an Amazon S3 bucket, their order must match the order of the source data. 

When loading from an Amazon DynamoDB table, order doesn't matter. The COPY command matches attribute names in the items retrieved from the DynamoDB table to column names in the Amazon Redshift table. For more information, see [Loading data from an Amazon DynamoDB table](t_Loading-data-from-dynamodb.md).

 The format for a column list is as follows.

```
COPY tablename (column1 [,column2, ...]) 
```

If a column in the target table is omitted from the column list, then COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression.

If the target column doesn't have a default, then COPY attempts to load NULL.

If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails. 

If an [IDENTITY](r_CREATE_TABLE_NEW.md#identity-clause) column is included in the column list, then [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids) must also be specified; if an IDENTITY column is omitted, then EXPLICIT_IDS can't be specified. If no column list is specified, the command behaves as if a complete, in-order column list was specified, with IDENTITY columns omitted if EXPLICIT_IDS was also not specified.

If a column is defined with GENERATED BY DEFAULT AS IDENTITY, then it can be copied. Values are either generated or updated with values that you supply. The EXPLICIT_IDS option isn't required. COPY doesn't update the identity high watermark. For more information, see [GENERATED BY DEFAULT AS IDENTITY](r_CREATE_TABLE_NEW.md#identity-generated-bydefault-clause). 

## JSONPaths file
<a name="copy-column-mapping-jsonpaths"></a>

When loading from data files in JSON or Avro format, COPY automatically maps the data elements in the JSON or Avro source data to the columns in the target table. It does so by matching field names in the JSON or Avro source data to column names in the target table or column list.

In some cases, your column names and field names don't match, or you need to map to deeper levels in the data hierarchy. In these cases, you can use a JSONPaths file to explicitly map JSON or Avro data elements to columns. 

For more information, see [JSONPaths file](copy-parameters-data-format.md#copy-json-jsonpaths). 

# Data format parameters
<a name="copy-parameters-data-format"></a>

By default, the COPY command expects the source data to be character-delimited UTF-8 text. The default delimiter is a pipe character ( | ). If the source data is in another format, use the following parameters to specify the data format: 
+ [FORMAT](#copy-format)
+ [CSV](#copy-csv)
+ [DELIMITER](#copy-delimiter) 
+ [FIXEDWIDTH](#copy-fixedwidth) 
+ [SHAPEFILE](#copy-shapefile) 
+ [AVRO](#copy-avro) 
+ [JSON format for COPY](#copy-json) 
+ [PARQUET](#copy-parquet) 
+ [ORC](#copy-orc) 

In addition to the standard data formats, COPY supports the following columnar data formats for COPY from Amazon S3: 
+ [ORC](#copy-orc) 
+ [PARQUET](#copy-parquet) 

COPY from columnar format is supported with certain restrictions. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md).

<a name="copy-data-format-parameters"></a>**Data format parameters**

FORMAT [AS]  <a name="copy-format"></a>
(Optional) Identifies data format keywords. The FORMAT arguments are described following.

CSV [ QUOTE [AS] *'quote_character'* ]  <a name="copy-csv"></a>
Enables use of CSV format in the input data. To automatically escape delimiters, newline characters, and carriage returns, enclose the field in the character specified by the QUOTE parameter. The default quotation mark character is a double quotation mark ( " ). When the quotation mark character is used within a field, escape the character with an additional quotation mark character. For example, if the quotation mark character is a double quotation mark, to insert the string `A "quoted" word` the input file should include the string `"A ""quoted"" word"`. When the CSV parameter is used, the default delimiter is a comma ( , ). You can specify a different delimiter by using the DELIMITER parameter.   
When a field is enclosed in quotation marks, white space between the delimiters and the quotation mark characters is ignored. If the delimiter is a white space character, such as a tab, the delimiter isn't treated as white space.  
CSV can't be used with FIXEDWIDTH, REMOVEQUOTES, or ESCAPE.     
QUOTE [AS] *'quote_character'*  <a name="copy-csv-quote"></a>
Optional. Specifies the character to be used as the quotation mark character when using the CSV parameter. The default is a double quotation mark ( " ). If you use the QUOTE parameter to define a quotation mark character other than double quotation mark, you don't need to escape double quotation marks within the field. The QUOTE parameter can be used only with the CSV parameter. The AS keyword is optional.
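The doubled-quotation-mark escaping described above matches common CSV conventions, which can be illustrated outside Redshift with Python's `csv` module (this demonstrates the file format only, not Redshift itself):

```python
import csv
import io

# Write a field that contains embedded double quotation marks.
buf = io.StringIO()
csv.writer(buf).writerow(['A "quoted" word', 'plain'])
line = buf.getvalue().strip()
# The embedded quotation marks are doubled and the whole field is
# enclosed in quotation marks, which is the form COPY ... CSV expects.
```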

DELIMITER [AS] ['*delimiter_char*']   <a name="copy-delimiter"></a>
Specifies characters that are used to separate fields in the input file, such as a pipe character ( `|` ), a comma ( `,` ), a tab ( `\t` ), or multiple characters such as `|~|`. Non-printable characters are supported. Characters can also be represented in octal as their UTF-8 code units. For octal, use the format '\ddd', where 'd' is an octal digit (0–7). The default delimiter is a pipe character ( `|` ), unless the CSV parameter is used, in which case the default delimiter is a comma ( `,` ). The AS keyword is optional. DELIMITER can't be used with FIXEDWIDTH.
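To illustrate the octal notation (a local Python check, not Redshift syntax): octal 174 is the code of the pipe character, so a delimiter written as `'\174'` is equivalent to writing `'|'`.

```python
# Octal 174 is the pipe character; octal 011 is the tab character.
assert chr(0o174) == '|'
assert chr(0o011) == '\t'
```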

FIXEDWIDTH '*fixedwidth_spec*'   <a name="copy-fixedwidth"></a>
Loads the data from a file where each column width is a fixed length, rather than columns being separated by a delimiter. The *fixedwidth_spec* is a string that specifies a user-defined column label and column width. The column label can be either a text string or an integer, depending on what the user chooses. The column label has no relation to the column name. The order of the label/width pairs must match the order of the table columns exactly. FIXEDWIDTH can't be used with CSV or DELIMITER. In Amazon Redshift, the length of CHAR and VARCHAR columns is expressed in bytes, so be sure that the column width that you specify accommodates the binary length of multibyte characters when preparing the file to be loaded. For more information, see [Character types](r_Character_types.md).   
The format for *fixedwidth_spec* is shown following:   

```
'colLabel1:colWidth1,colLabel2:colWidth2, ...'
```
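To make the label:width pairing concrete, the following hypothetical sketch slices one line of fixed-width data using a spec in the format above (a local illustration only; COPY relates widths to table columns by position, so the labels are ignored here):

```python
def split_fixedwidth(line, spec):
    # Split a line into fields using a 'label:width,...' spec.
    # Only the widths drive the slicing; the labels carry no meaning.
    fields, pos = [], 0
    for pair in spec.split(','):
        _label, width = pair.split(':')
        fields.append(line[pos:pos + int(width)])
        pos += int(width)
    return fields

line = '0001' + 'Anna'.ljust(10) + 'Oslo'.ljust(10)
row = split_fixedwidth(line, 'id:4,name:10,city:10')
# row holds '0001', then the 10-byte name and city fields
```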

SHAPEFILE [ SIMPLIFY [AUTO] [*'tolerance'*] ]  <a name="copy-shapefile"></a>
Enables use of SHAPEFILE format in the input data. By default, the first column of the shapefile is either a `GEOMETRY` or `IDENTITY` column. All subsequent columns follow the order specified in the shapefile.  
You can't use SHAPEFILE with FIXEDWIDTH, REMOVEQUOTES, or ESCAPE.   
To use `GEOGRAPHY` objects with `COPY FROM SHAPEFILE`, first ingest into a `GEOMETRY` column, and then cast the objects to `GEOGRAPHY` objects.    
SIMPLIFY [*tolerance*]  <a name="copy-shapefile-simplify"></a>
(Optional) Simplifies all geometries during the ingestion process using the Ramer-Douglas-Peucker algorithm and the given tolerance.   
SIMPLIFY AUTO [*tolerance*]  <a name="copy-shapefile-simplify"></a>
(Optional) Simplifies only geometries that are larger than the maximum geometry size. This simplification uses the Ramer-Douglas-Peucker algorithm and the automatically calculated tolerance if this doesn't exceed the specified tolerance. This algorithm calculates the size to store objects within the tolerance specified. The *tolerance* value is optional.
For examples of loading shapefiles, see [Loading a shapefile into Amazon Redshift](r_COPY_command_examples.md#copy-example-spatial-copy-shapefile).

AVRO [AS] '*avro_option*'  <a name="copy-avro"></a>
Specifies that the source data is in Avro format.   
Avro format is supported for COPY from these services and protocols:  
+ Amazon S3 
+ Amazon EMR 
+ Remote hosts (SSH) 
Avro isn't supported for COPY from DynamoDB.   
Avro is a data serialization protocol. An Avro source file includes a schema that defines the structure of the data. The Avro schema type must be `record`. COPY accepts Avro files created using the default uncompressed codec as well as the `deflate` and `snappy` compression codecs. For more information about Avro, go to [Apache Avro](https://avro.apache.org/).   
Valid values for *avro_option* are as follows:  
+ `'auto'`
+ `'auto ignorecase'`
+ `'s3://jsonpaths_file'` 
The default is `'auto'`.  
COPY automatically maps the data elements in the Avro source data to the columns in the target table. It does so by matching field names in the Avro schema to column names in the target table. The matching is case-sensitive for `'auto'` and isn't case-sensitive for `'auto ignorecase'`.   
Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto'` option, matching field names must also be lowercase. If the field names aren't all lowercase, you can use the `'auto ignorecase'` option. With the default `'auto'` argument, COPY recognizes only the first level of fields, or *outer fields*, in the structure.   
To explicitly map column names to Avro field names, you can use a [JSONPaths file](#copy-json-jsonpaths).   
By default, COPY attempts to match all columns in the target table to Avro field names. To load a subset of the columns, you can optionally specify a column list. If a column in the target table is omitted from the column list, COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression. If the target column doesn't have a default, COPY attempts to load NULL. If a column is included in the column list and COPY doesn't find a matching field in the Avro data, COPY attempts to load NULL to the column.   
If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails.   
<a name="copy-avro-schema"></a>**Avro Schema**  
An Avro source data file includes a schema that defines the structure of the data. COPY reads the schema that is part of the Avro source data file to map data elements to target table columns. The following example shows an Avro schema.   

```
{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "guid", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "address", "type": "string"}]
}
```
The Avro schema is defined using JSON format. The top-level JSON object contains three name-value pairs with the names, or *keys*, `"name"`, `"type"`, and `"fields"`.   
The `"fields"` key pairs with an array of objects that define the name and data type of each field in the data structure. By default, COPY automatically matches the field names to column names. Column names are always lowercase, so matching field names must also be lowercase, unless you specify the `'auto ignorecase'` option. Any field names that don't match a column name are ignored. Order doesn't matter. In the previous example, COPY maps to the column names `id`, `guid`, `name`, and `address`.   
With the default `'auto'` argument, COPY matches only the first-level objects to columns. To map to deeper levels in the schema, or if field names and column names don't match, use a JSONPaths file to define the mapping. For more information, see [JSONPaths file](#copy-json-jsonpaths).   
If the value associated with a key is a complex Avro data type such as byte, array, record, map, or link, COPY loads the value as a string. Here, the string is the JSON representation of the data. COPY loads Avro enum data types as strings, where the content is the name of the type. For an example, see [COPY from JSON format](copy-usage_notes-copy-from-json.md).  
The maximum size of the Avro file header, which includes the schema and file metadata, is 1 MB.     
The maximum size of a single Avro data block is 4 MB. This is distinct from the maximum row size. If the maximum size of a single Avro data block is exceeded, even if the resulting row size is less than the 4 MB row-size limit, the COPY command fails.   
In calculating row size, Amazon Redshift internally counts pipe characters ( | ) twice. If your input data contains a very large number of pipe characters, it is possible for row size to exceed 4 MB even if the data block is less than 4 MB.
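The case-sensitivity rule for `'auto'` versus `'auto ignorecase'` can be sketched as a simple field-to-column match. This is an illustration of the matching rule only, not the actual loader:

```python
def match_fields(schema_fields, table_columns, ignorecase=False):
    # Return the Avro schema fields that map to a table column under
    # 'auto' (exact, case-sensitive) or 'auto ignorecase' matching.
    # Unmatched fields are ignored, as described above.
    if ignorecase:
        cols = {c.lower() for c in table_columns}
        return [f for f in schema_fields if f.lower() in cols]
    return [f for f in schema_fields if f in set(table_columns)]

fields = ['id', 'isActive', 'name']
columns = ['id', 'isactive', 'name']      # Redshift column names are lowercase
strict = match_fields(fields, columns)    # 'isActive' is ignored
loose = match_fields(fields, columns, ignorecase=True)  # all three match
```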

JSON [AS] '*json_option*'  <a name="copy-json"></a>
The source data is in JSON format.   
JSON format is supported for COPY from these services and protocols:  
+ Amazon S3
+ COPY from Amazon EMR
+ COPY from SSH
JSON isn't supported for COPY from DynamoDB.   
Valid values for *json_option* are as follows:  
+ `'auto'`
+ `'auto ignorecase'`
+ `'s3://jsonpaths_file'` 
+ `'noshred'` 
The default is `'auto'`.  
By default, COPY attempts to match all columns in the target table to JSON field name keys. To load a subset of the columns, you can optionally specify a column list. If the JSON field name keys aren't all lowercase, you can use the `'auto ignorecase'` option or a [JSONPaths file](#copy-json-jsonpaths) to explicitly map column names to JSON field name keys.  
If a column in the target table is omitted from the column list, then COPY loads the target column's [DEFAULT](r_CREATE_TABLE_NEW.md#create-table-default) expression. If the target column doesn't have a default, COPY attempts to load NULL. If a column is included in the column list and COPY doesn't find a matching field in the JSON data, COPY attempts to load NULL to the column.   
If COPY attempts to assign NULL to a column that is defined as NOT NULL, the COPY command fails.   
COPY maps the data elements in the JSON source data to the columns in the target table. It does so by matching *object keys*, or names, in the source name-value pairs to the names of columns in the target table.   
Refer to the following details about each *json_option* value:    
'auto'  <a name="copy-json-auto"></a>
With this option, matching is case-sensitive. Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto'` option, matching JSON field names must also be lowercase.  
'auto ignorecase'  <a name="copy-json-auto-ignorecase"></a>
With this option, the matching isn't case-sensitive. Column names in Amazon Redshift tables are always lowercase, so when you use the `'auto ignorecase'` option, the corresponding JSON field names can be lowercase, uppercase, or mixed-case.   
's3://*jsonpaths_file*'  <a name="copy-json-pathfile"></a>
With this option, COPY uses the named JSONPaths file to map the data elements in the JSON source data to the columns in the target table. The *`s3://jsonpaths_file`* argument must be an Amazon S3 object key that explicitly references a single file. An example is `'s3://amzn-s3-demo-bucket/jsonpaths.txt'`. The argument can't be a key prefix. For more information about using a JSONPaths file, see [JSONPaths file](#copy-json-jsonpaths).  
In some cases, the file specified by `jsonpaths_file` has the same prefix as the path specified by `copy_from_s3_objectpath` for the data files. If so, COPY reads the JSONPaths file as a data file and returns errors. For example, suppose that your data files use the object path `s3://amzn-s3-demo-bucket/my_data.json` and your JSONPaths file is `s3://amzn-s3-demo-bucket/my_data.jsonpaths`. In this case, COPY attempts to load `my_data.jsonpaths` as a data file.  
'noshred'  <a name="copy-json-noshred"></a>
With this option, Amazon Redshift doesn't shred the attributes of JSON structures into multiple columns while loading a JSON document.

## JSON data file
<a name="copy-json-data-file"></a>

The JSON data file contains a set of either objects or arrays. COPY loads each JSON object or array into one row in the target table. Each object or array corresponding to a row must be a stand-alone, root-level structure; that is, it must not be a member of another JSON structure.

A JSON *object* begins and ends with braces ( { } ) and contains an unordered collection of name-value pairs. Each paired name and value are separated by a colon, and the pairs are separated by commas. By default, the *object key*, or name, in the name-value pairs must match the name of the corresponding column in the table. Column names in Amazon Redshift tables are always lowercase, so matching JSON field name keys must also be lowercase. If your column names and JSON keys don't match, use a [JSONPaths file](#copy-json-jsonpaths) to explicitly map columns to keys. 

Order in a JSON object doesn't matter. Any names that don't match a column name are ignored. The following shows the structure of a simple JSON object.

```
{
  "column1": "value1",
  "column2": value2,
  "notacolumn" : "ignore this value"
}
```

A JSON *array* begins and ends with brackets ( [  ] ), and contains an ordered collection of values separated by commas. If your data files use arrays, you must specify a JSONPaths file to match the values to columns. The following shows the structure of a simple JSON array. 

```
["value1", value2]
```

The JSON must be well-formed. For example, the objects or arrays can't be separated by commas or any other characters except white space. Strings must be enclosed in double quotation mark characters. Quote characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks.

The maximum size of a single JSON object or array, including braces or brackets, is 4 MB. This is distinct from the maximum row size. If the maximum size of a single JSON object or array is exceeded, even if the resulting row size is less than the 4 MB row-size limit, the COPY command fails. 

In calculating row size, Amazon Redshift internally counts pipe characters ( | ) twice. If your input data contains a very large number of pipe characters, it is possible for row size to exceed 4 MB even if the object size is less than 4 MB.

COPY loads `\n` as a newline character and loads `\t` as a tab character. To load a backslash, escape it with a backslash ( `\\` ).

COPY searches the specified JSON source for a well-formed, valid JSON object or array. If COPY encounters any non–white-space characters before locating a usable JSON structure, or between valid JSON objects or arrays, COPY returns an error for each instance. These errors count toward the MAXERROR error count. When the error count equals or exceeds MAXERROR, COPY fails. 

For each error, Amazon Redshift records a row in the STL_LOAD_ERRORS system table. The LINE_NUMBER column records the last line of the JSON object that caused the error. 

If IGNOREHEADER is specified, COPY ignores the specified number of lines in the JSON data. Newline characters in the JSON data are always counted for IGNOREHEADER calculations. 

COPY loads empty strings as empty fields by default. If EMPTYASNULL is specified, COPY loads empty strings for CHAR and VARCHAR fields as NULL. Empty strings for other data types, such as INT, are always loaded with NULL. 

The following options aren't supported with JSON: 
+ CSV
+ DELIMITER 
+ ESCAPE
+ FILLRECORD 
+ FIXEDWIDTH
+ IGNOREBLANKLINES
+ NULL AS
+ READRATIO
+ REMOVEQUOTES 

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). For more information about JSON data structures, go to [www.json.org](https://www.json.org/). 

## JSONPaths file
<a name="copy-json-jsonpaths"></a>

If you are loading from JSON-formatted or Avro source data, by default COPY maps the first-level data elements in the source data to the columns in the target table. It does so by matching each name, or object key, in a name-value pair to the name of a column in the target table. 

If your column names and object keys don't match, or to map to deeper levels in the data hierarchy, you can use a JSONPaths file to explicitly map JSON or Avro data elements to columns. The JSONPaths file maps JSON data elements to columns by matching the column order in the target table or column list.

The JSONPaths file must contain only a single JSON object (not an array). The JSON object is a name-value pair. The *object key*, which is the name in the name-value pair, must be `"jsonpaths"`. The *value* in the name-value pair is an array of *JSONPath expressions*. Each JSONPath expression references a single element in the JSON data hierarchy or Avro schema, similarly to how an XPath expression refers to elements in an XML document. For more information, see [JSONPath expressions](#copy-json-jsonpath-expressions).

To use a JSONPaths file, add the JSON or AVRO keyword to the COPY command. Specify the S3 bucket name and object path of the JSONPaths file using the following format.

```
COPY tablename 
FROM 'data_source' 
CREDENTIALS 'credentials-args' 
FORMAT AS { AVRO | JSON } 's3://jsonpaths_file';
```

The `s3://jsonpaths_file` value must be an Amazon S3 object key that explicitly references a single file, such as `'s3://amzn-s3-demo-bucket/jsonpaths.txt'`. It can't be a key prefix. 

In some cases, if you're loading from Amazon S3 the file specified by `jsonpaths_file` has the same prefix as the path specified by `copy_from_s3_objectpath` for the data files. If so, COPY reads the JSONPaths file as a data file and returns errors. For example, suppose that your data files use the object path `s3://amzn-s3-demo-bucket/my_data.json` and your JSONPaths file is `s3://amzn-s3-demo-bucket/my_data.jsonpaths`. In this case, COPY attempts to load `my_data.jsonpaths` as a data file.

 If the key name is any string other than `"jsonpaths"`, the COPY command doesn't return an error, but it ignores *jsonpaths_file* and uses the `'auto'` argument instead. 

If any of the following occurs, the COPY command fails:
+ The JSON is malformed.
+ There is more than one JSON object.
+ Any characters except white space exist outside the object.
+ An array element is an empty string or isn't a string.
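Taken together, these failure conditions amount to a small structural check. The following hypothetical validator mirrors them for a JSONPaths file read locally (it also treats a missing `"jsonpaths"` key as invalid, although COPY would instead fall back to the `'auto'` argument):

```python
import json

def check_jsonpaths(text):
    # Mirror the COPY failure conditions for a JSONPaths file:
    # well-formed JSON, a single object, every element a non-empty string.
    try:
        doc = json.loads(text)  # also fails on trailing extra objects
    except ValueError:
        return False            # malformed JSON
    if not isinstance(doc, dict):
        return False            # must be one JSON object, not an array
    paths = doc.get('jsonpaths')
    if not isinstance(paths, list):
        return False
    return all(isinstance(p, str) and p != '' for p in paths)

good = check_jsonpaths('{"jsonpaths": ["$.id", "$.name"]}')
bad = check_jsonpaths('{"jsonpaths": ["$.id", ""]}')  # empty element
```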

MAXERROR doesn't apply to the JSONPaths file. 

The JSONPaths file must not be encrypted, even if the [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted) option is specified.

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). 

## JSONPath expressions
<a name="copy-json-jsonpath-expressions"></a>

The JSONPaths file uses JSONPath expressions to map data fields to target columns. Each JSONPath expression corresponds to one column in the Amazon Redshift target table. The order of the JSONPath array elements must match the order of the columns in the target table or the column list, if a column list is used. 

The double quotation mark characters are required as shown, both for the field names and the values. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks.

If an object element referenced by a JSONPath expression isn't found in the JSON data, COPY attempts to load a NULL value. If the referenced object is malformed, COPY returns a load error. 

If an array element referenced by a JSONPath expression isn't found in the JSON or Avro data, COPY fails with the following error: `Invalid JSONPath format: Not an array or index out of range.` Remove any array elements from the JSONPaths that don't exist in the source data and verify that the arrays in the source data are well formed.  

The JSONPath expressions can use either bracket notation or dot notation, but you can't mix notations. The following example shows JSONPath expressions using bracket notation. 

```
{
    "jsonpaths": [
        "$['venuename']",
        "$['venuecity']",
        "$['venuestate']",
        "$['venueseats']"
    ]
}
```

The following example shows JSONPath expressions using dot notation. 

```
{
    "jsonpaths": [
        "$.venuename",
        "$.venuecity",
        "$.venuestate",
        "$.venueseats"
    ]
}
```

In the context of Amazon Redshift COPY syntax, a JSONPath expression must specify the explicit path to a single name element in a JSON or Avro hierarchical data structure. Amazon Redshift doesn't support any JSONPath elements, such as wildcard characters or filter expressions, that might resolve to an ambiguous path or multiple name elements.
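A dot-notation expression therefore resolves to exactly one element by walking the hierarchy step by step. The small hypothetical resolver below illustrates this, supporting only the explicit names and zero-based indexes that Amazon Redshift allows (no wildcards or filters):

```python
import re

def resolve(expr, data):
    # Walk nested dicts/lists for a path such as '$.friends[0].id'.
    # Each step is an explicit name, optionally with one array index.
    node = data
    for name, index in re.findall(r'\.(\w+)(?:\[(\d+)\])?', expr):
        node = node[name]
        if index != '':
            node = node[int(index)]
    return node

record = {"venuename": "Garden", "friends": [{"id": 7, "name": "Ana"}]}
venue = resolve('$.venuename', record)          # 'Garden'
friend_id = resolve('$.friends[0].id', record)  # 7
```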

For more information, see [COPY from JSON format](copy-usage_notes-copy-from-json.md). 

## Using JSONPaths with Avro data
<a name="using-jsonpath-with-avro"></a>

The following example shows an Avro schema with multiple levels.

```
{
    "name": "person",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "guid", "type": "string"},
        {"name": "isActive", "type": "boolean"},
        {"name": "age", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "address", "type": "string"},
        {"name": "latitude", "type": "double"},
        {"name": "longitude", "type": "double"},
        {
            "name": "tags",
            "type": {
                        "type" : "array",
                        "name" : "inner_tags",
                        "items" : "string"
                    }
        },
        {
            "name": "friends",
            "type": {
                        "type" : "array",
                        "name" : "inner_friends",
                        "items" : {
                                    "name" : "friends_record",
                                    "type" : "record",
                                    "fields" : [
                                                 {"name" : "id", "type" : "int"},
                                                 {"name" : "name", "type" : "string"}
                                               ]
                                  }
                    }
        },
        {"name": "randomArrayItem", "type": "string"}
    ]
}
```

The following example shows a JSONPaths file that uses AvroPath expressions to reference the previous schema. 

```
{
    "jsonpaths": [
        "$.id",
        "$.guid",
        "$.address",
        "$.friends[0].id"
    ]
}
```

The JSONPaths example includes the following elements:

jsonpaths  
The name of the JSON object that contains the AvroPath expressions.

[ … ]  
Brackets enclose the JSON array that contains the path elements.

$  
The dollar sign refers to the root element in the Avro schema, which is the `"fields"` array.

"$.id",  
The target of the AvroPath expression. In this instance, the target is the element in the `"fields"` array with the name `"id"`. The expressions are separated by commas.

"$.friends[0].id"  
Brackets indicate an array index. JSONPath expressions use zero-based indexing, so this expression references the first element in the `"friends"` array with the name `"id"`.

The Avro schema syntax requires using *inner fields* to define the structure of record and array data types. The inner fields are ignored by the AvroPath expressions. For example, the field `"friends"` defines an array named `"inner_friends"`, which in turn defines a record named `"friends_record"`. The AvroPath expression to reference the field `"id"` can ignore the extra fields to reference the target field directly. The following AvroPath expressions reference the two fields that belong to the `"friends"` array.

```
"$.friends[0].id"
"$.friends[0].name"
```

## Columnar data format parameters
<a name="copy-parameters-columnar-data"></a>

In addition to the standard data formats, COPY supports the following columnar data formats for COPY from Amazon S3. COPY from columnar format is supported with certain restrictions. For more information, see [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md). 

ORC  <a name="copy-orc"></a>
Loads the data from a file that uses Optimized Row Columnar (ORC) file format. 

PARQUET  <a name="copy-parquet"></a>
Loads the data from a file that uses Parquet file format. 

# File compression parameters
<a name="copy-parameters-file-compression"></a>

You can load from compressed data files by specifying the following parameters.

BZIP2   <a name="copy-bzip2"></a>
A value that specifies that the input file or files are in compressed bzip2 format (.bz2 files). The COPY operation reads each compressed file and uncompresses the data as it loads.

GZIP   <a name="copy-gzip"></a>
A value that specifies that the input file or files are in compressed gzip format (.gz files). The COPY operation reads each compressed file and uncompresses the data as it loads.

LZOP   <a name="copy-lzop"></a>
A value that specifies that the input file or files are in compressed lzop format (.lzo files). The COPY operation reads each compressed file and uncompresses the data as it loads.  
COPY doesn't support files that are compressed using the lzop *--filter* option.

ZSTD   <a name="copy-zstd"></a>
A value that specifies that the input file or files are in compressed Zstandard format (.zst files). The COPY operation reads each compressed file and uncompresses the data as it loads.  
ZSTD is supported only with COPY from Amazon S3.
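For example, a COPY that loads pipe-delimited, gzip-compressed files might look like the following (the table, bucket, and role names are illustrative, not from this guide):

```
copy sales from 's3://amzn-s3-demo-bucket/sales/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|'
gzip;
```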

# Data conversion parameters
<a name="copy-parameters-data-conversion"></a>

As it loads the table, COPY attempts to implicitly convert the strings in the source data to the data type of the target column. If you need to specify a conversion that is different from the default behavior, or if the default conversion results in errors, you can manage data conversions by specifying the following parameters. For more information on the syntax of these parameters, see [COPY syntax](https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html#r_COPY-syntax).
+ [ACCEPTANYDATE](#copy-acceptanydate) 
+ [ACCEPTINVCHARS](#copy-acceptinvchars) 
+ [BLANKSASNULL](#copy-blanksasnull) 
+ [DATEFORMAT](#copy-dateformat) 
+ [EMPTYASNULL](#copy-emptyasnull) 
+ [ENCODING](#copy-encoding) 
+ [ESCAPE](#copy-escape) 
+ [EXPLICIT_IDS](#copy-explicit-ids) 
+ [FILLRECORD](#copy-fillrecord) 
+ [IGNOREBLANKLINES](#copy-ignoreblanklines) 
+ [IGNOREHEADER](#copy-ignoreheader) 
+ [NULL AS](#copy-null-as) 
+ [REMOVEQUOTES](#copy-removequotes) 
+ [ROUNDEC](#copy-roundec) 
+ [TIMEFORMAT](#copy-timeformat) 
+ [TRIMBLANKS](#copy-trimblanks) 
+ [TRUNCATECOLUMNS](#copy-truncatecolumns) <a name="copy-data-conversion-parameters"></a>

ACCEPTANYDATE   <a name="copy-acceptanydate"></a>
Allows any date format, including invalid formats such as `00/00/00 00:00:00`, to be loaded without generating an error. This parameter applies only to TIMESTAMP and DATE columns. Always use ACCEPTANYDATE with the DATEFORMAT parameter. If the date format for the data doesn't match the DATEFORMAT specification, Amazon Redshift inserts a NULL value into that field.
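As a sketch (the table, bucket, and role names here are hypothetical), combining ACCEPTANYDATE with DATEFORMAT loads unparseable date values as NULL rather than failing the load:

```
copy event from 's3://amzn-s3-demo-bucket/event/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
dateformat 'YYYY-MM-DD'
acceptanydate;
```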

ACCEPTINVCHARS [AS] ['*replacement\_char*']   <a name="copy-acceptinvchars"></a>
Enables loading of data into VARCHAR columns even if the data contains invalid UTF-8 characters. When ACCEPTINVCHARS is specified, COPY replaces each invalid UTF-8 character with a string of equal length consisting of the character specified by *replacement\_char*. For example, if the replacement character is '`^`', an invalid three-byte character will be replaced with '`^^^`'.  
 The replacement character can be any ASCII character except NULL. The default is a question mark ( ? ). For information about invalid UTF-8 characters, see [Multibyte character load errors](multi-byte-character-load-errors.md).  
COPY returns the number of rows that contained invalid UTF-8 characters, and it adds an entry to the [STL\_REPLACEMENTS](r_STL_REPLACEMENTS.md) system table for each affected row, up to a maximum of 100 rows for each node slice. Additional invalid UTF-8 characters are also replaced, but those replacement events aren't recorded.  
If ACCEPTINVCHARS isn't specified, COPY returns an error whenever it encounters an invalid UTF-8 character.   
ACCEPTINVCHARS is valid only for VARCHAR columns.
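For example, a COPY like the following (table, bucket, and role names are illustrative) would load rows containing invalid UTF-8 bytes, replacing each bad character with a caret:

```
copy reviews from 's3://amzn-s3-demo-bucket/reviews/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
acceptinvchars as '^';
```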

BLANKSASNULL   <a name="copy-blanksasnull"></a>
Loads blank fields, which consist of only white space characters, as NULL. This option applies only to CHAR and VARCHAR columns. Blank fields for other data types, such as INT, are always loaded with NULL. For example, a string that contains three space characters in succession (and no other characters) is loaded as a NULL. The default behavior, without this option, is to load the space characters as is. 

DATEFORMAT [AS] \{'*dateformat\_string*' \| 'auto' \}  <a name="copy-dateformat"></a>
If no DATEFORMAT is specified, the default format is `'YYYY-MM-DD'`. For example, an alternative valid format is `'MM-DD-YYYY'`.   
If the COPY command doesn't recognize the format of your date or time values, or if your date or time values use different formats, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT and TIMEFORMAT string. The `'auto'` keyword is case-sensitive. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).   
The date format can include time information (hour, minutes, seconds), but this information is ignored. The AS keyword is optional. For more information, see [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md).
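For instance, to load files whose date columns use month-first formatting (hypothetical table and bucket names), you might specify:

```
copy orders from 's3://amzn-s3-demo-bucket/orders/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
dateformat 'MM-DD-YYYY';
```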

EMPTYASNULL   <a name="copy-emptyasnull"></a>
Indicates that Amazon Redshift should load empty CHAR and VARCHAR fields as NULL. Empty fields for other data types, such as INT, are always loaded with NULL. Empty fields occur when data contains two delimiters in succession with no characters between the delimiters. EMPTYASNULL and NULL AS '' (empty string) produce the same behavior.

ENCODING [AS] *file\_encoding*  <a name="copy-encoding"></a>
Specifies the encoding type of the load data. The COPY command converts the data from the specified encoding into UTF-8 during loading.   
Valid values for *file\_encoding* are as follows:  
+ `UTF8`
+ `UTF16`
+ `UTF16LE`
+ `UTF16BE`
+ `ISO88591`
The default is `UTF8`.  
Source file names must use UTF-8 encoding.  
The following files must use UTF-8 encoding, even if a different encoding is specified for the load data:  
+ Manifest files
+ JSONPaths files
The argument strings provided with the following parameters must use UTF-8:  
+ FIXEDWIDTH '*fixedwidth\_spec*'
+ ACCEPTINVCHARS '*replacement\_char*'
+ DATEFORMAT '*dateformat\_string*'
+ TIMEFORMAT '*timeformat\_string*'
+ NULL AS '*null\_string*'
Fixed-width data files must use UTF-8 encoding. The field widths are based on the number of characters, not the number of bytes.   
All load data must use the specified encoding. If COPY encounters a different encoding, it skips the file and returns an error.   
If you specify `UTF16`, then your data must have a byte order mark (BOM). If you know whether your UTF-16 data is little-endian (LE) or big-endian (BE), you can use `UTF16LE` or `UTF16BE`, regardless of the presence of a BOM.   
To use ISO-8859-1 encoding specify `ISO88591`. For more information, see [ISO/IEC 8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) in *Wikipedia*.

ESCAPE   <a name="copy-escape"></a>
When this parameter is specified, the backslash character (`\`) in input data is treated as an escape character. The character that immediately follows the backslash character is loaded into the table as part of the current column value, even if it is a character that normally serves a special purpose. For example, you can use this parameter to escape the delimiter character, a quotation mark, an embedded newline character, or the escape character itself when any of these characters is a legitimate part of a column value.  
If you specify the ESCAPE parameter in combination with the REMOVEQUOTES parameter, you can escape and retain quotation marks (`'` or `"`) that might otherwise be removed. The default null string, `\N`, works as is, but it can also be escaped in the input data as `\\N`. As long as you don't specify an alternative null string with the NULL AS parameter, `\N` and `\\N` produce the same results.  
The control character `0x00` (NUL) can't be escaped and should be removed from the input data or converted. This character is treated as an end of record (EOR) marker, causing the remainder of the record to be truncated.
You can't use the ESCAPE parameter for FIXEDWIDTH loads, and you can't specify the escape character itself; the escape character is always the backslash character. Also, you must ensure that the input data contains the escape character in the appropriate places.  
Here are some examples of input data and the resulting loaded data when the ESCAPE parameter is specified. The result for row 4 assumes that the REMOVEQUOTES parameter is also specified. The input data consists of two pipe-delimited fields:   

```
1|The quick brown fox\[newline]
jumped over the lazy dog.
2| A\\B\\C
3| A \| B \| C
4| 'A Midsummer Night\'s Dream'
```
The data loaded into column 2 looks like this:   

```
The quick brown fox
jumped over the lazy dog.
A\B\C
A|B|C
A Midsummer Night's Dream
```
Applying the escape character to the input data for a load is the responsibility of the user. One exception to this requirement is when you reload data that was previously unloaded with the ESCAPE parameter. In this case, the data will already contain the necessary escape characters.
The ESCAPE parameter doesn't interpret octal, hex, Unicode, or other escape sequence notation. For example, if your source data contains the octal line feed value (`\012`) and you try to load this data with the ESCAPE parameter, Amazon Redshift loads the value `012` into the table and doesn't interpret this value as a line feed that is being escaped.  
In order to escape newline characters in data that originates from Microsoft Windows platforms, you might need to use two escape characters: one for the carriage return and one for the line feed. Alternatively, you can remove the carriage returns before loading the file (for example, by using the dos2unix utility).
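A minimal sketch of a load that uses ESCAPE together with REMOVEQUOTES, matching the pipe-delimited input shown above (the table, bucket, and role names are illustrative):

```
copy category from 's3://amzn-s3-demo-bucket/category/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|'
escape
removequotes;
```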

EXPLICIT\_IDS   <a name="copy-explicit-ids"></a>
Use EXPLICIT\_IDS with tables that have IDENTITY columns if you want to override the autogenerated values with explicit values from the source data files for the tables. If the command includes a column list, that list must include the IDENTITY columns to use this parameter. The data format for EXPLICIT\_IDS values must match the IDENTITY format specified by the CREATE TABLE definition.  
When you run a COPY command against a table with the EXPLICIT\_IDS option, Amazon Redshift does not check the uniqueness of IDENTITY columns in the table.  
If a column is defined with GENERATED BY DEFAULT AS IDENTITY, then it can be copied. Values are generated or updated with values that you supply. The EXPLICIT\_IDS option isn't required. COPY doesn't update the identity high watermark.  
 For an example of a COPY command using EXPLICIT\_IDS, see [Load VENUE with explicit values for an IDENTITY column](r_COPY_command_examples.md#r_COPY_command_examples-load-venue-with-explicit-values-for-an-identity-column).

FILLRECORD   <a name="copy-fillrecord"></a>
Allows data files to be loaded when contiguous columns are missing at the end of some of the records. The missing columns are loaded as NULLs. For text and CSV formats, if the missing column is a VARCHAR column, zero-length strings are loaded instead of NULLs. To load NULLs to VARCHAR columns from text and CSV, specify the EMPTYASNULL keyword. NULL substitution only works if the column definition allows NULLs.  
For example, if the table definition contains four nullable CHAR columns, and a record contains the values `apple, orange, banana, mango`, the COPY command could load and fill in a record that contains only the values `apple, orange`. The missing CHAR values would be loaded as NULL values.

IGNOREBLANKLINES   <a name="copy-ignoreblanklines"></a>
Ignores blank lines that only contain a line feed in a data file and does not try to load them.

IGNOREHEADER [ AS ] *number\_rows*   <a name="copy-ignoreheader"></a>
Treats the specified *number\_rows* as a file header and doesn't load them. Use IGNOREHEADER to skip file headers in all files in a parallel load.
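For example, to skip a single header row in each CSV file (hypothetical table and bucket names):

```
copy sales from 's3://amzn-s3-demo-bucket/sales/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv
ignoreheader 1;
```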

NULL AS '*null\_string*'   <a name="copy-null-as"></a>
Loads fields that match *null\_string* as NULL, where *null\_string* can be any string. If your data includes a null terminator, also referred to as NUL (UTF-8 0000) or binary zero (0x000), COPY treats it as any other character. For example, a record containing '1' \|\| NUL \|\| '2' is copied as a string of length 3 bytes. If a field contains only NUL, you can use NULL AS to replace the null terminator with NULL by specifying `'\0'` or `'\000'`, for example, `NULL AS '\0'` or `NULL AS '\000'`. If a field contains a string that ends with NUL and NULL AS is specified, the string is inserted with NUL at the end. Do not use '\n' (newline) for the *null\_string* value. Amazon Redshift reserves '\n' for use as a line delimiter. The default *null\_string* is `'\N'`.  
If you attempt to load nulls into a column defined as NOT NULL, the COPY command will fail.
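As an illustration (table, bucket, and role names are hypothetical), the following treats fields containing only a binary zero as NULL:

```
copy sensor_readings from 's3://amzn-s3-demo-bucket/readings/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|'
null as '\0';
```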

REMOVEQUOTES   <a name="copy-removequotes"></a>
Removes surrounding quotation marks from strings in the incoming data. All characters within the quotation marks, including delimiters, are retained. If a string has a beginning single or double quotation mark but no corresponding ending mark, the COPY command fails to load that row and returns an error. The following table shows some simple examples of strings that contain quotation marks and the resulting loaded values.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html)

ROUNDEC   <a name="copy-roundec"></a>
Rounds up numeric values when the scale of the input value is greater than the scale of the column. By default, COPY truncates values when necessary to fit the scale of the column. For example, if a value of `20.259` is loaded into a DECIMAL(8,2) column, COPY truncates the value to `20.25` by default. If ROUNDEC is specified, COPY rounds the value to `20.26`. The INSERT command always rounds values when necessary to match the column's scale, so a COPY command with the ROUNDEC parameter behaves the same as an INSERT command.
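A sketch of the behavior described above (hypothetical names): with ROUNDEC, a source value of `20.259` loaded into a DECIMAL(8,2) column is stored as `20.26` rather than truncated to `20.25`.

```
copy prices from 's3://amzn-s3-demo-bucket/prices/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter ','
roundec;
```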

TIMEFORMAT [AS] \{'*timeformat\_string*' \| 'auto' \| 'epochsecs' \| 'epochmillisecs' \}  <a name="copy-timeformat"></a>
Specifies the time format. If no TIMEFORMAT is specified, the default format is `YYYY-MM-DD HH:MI:SS` for TIMESTAMP columns or `YYYY-MM-DD HH:MI:SSOF` for TIMESTAMPTZ columns, where `OF` is the offset from Coordinated Universal Time (UTC). You can't include a time zone specifier in the *timeformat\_string*. To load TIMESTAMPTZ data that is in a format different from the default format, specify 'auto'; for more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md). For more information about *timeformat\_string*, see [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md).  
The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT and TIMEFORMAT string. If the COPY command doesn't recognize the format of your date or time values, or if your date and time values use formats different from each other, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).   
If your source data is represented as epoch time, that is the number of seconds or milliseconds since January 1, 1970, 00:00:00 UTC, specify `'epochsecs'` or `'epochmillisecs'`.   
The `'auto'`, `'epochsecs'`, and `'epochmillisecs'` keywords are case-sensitive.  
The AS keyword is optional.
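For example, to load timestamps stored as Unix epoch seconds (the table, bucket, and role names are illustrative):

```
copy clickstream from 's3://amzn-s3-demo-bucket/clicks/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
timeformat 'epochsecs';
```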

TRIMBLANKS   <a name="copy-trimblanks"></a>
Removes the trailing white space characters from a VARCHAR string. This parameter applies only to columns with a VARCHAR data type.

TRUNCATECOLUMNS   <a name="copy-truncatecolumns"></a>
Truncates data in columns to the appropriate number of characters so that it fits the column specification. Applies only to columns with a VARCHAR or CHAR data type, and rows 4 MB or less in size.
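TRIMBLANKS and TRUNCATECOLUMNS are often combined when loading free-form text; a sketch with hypothetical names:

```
copy comments from 's3://amzn-s3-demo-bucket/comments/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|'
trimblanks
truncatecolumns;
```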

# Data load operations
<a name="copy-parameters-data-load"></a>

Manage the default behavior of the load operation for troubleshooting or to reduce load times by specifying the following parameters. 
+ [COMPROWS](#copy-comprows) 
+ [COMPUPDATE](#copy-compupdate) 
+ [IGNOREALLERRORS](#copy-ignoreallerrors) 
+ [MAXERROR](#copy-maxerror) 
+ [NOLOAD](#copy-noload) 
+ [STATUPDATE](#copy-statupdate) <a name="copy-data-load-parameters"></a>

COMPROWS *numrows*   <a name="copy-comprows"></a>
Specifies the number of rows to be used as the sample size for compression analysis. The analysis is run on rows from each data slice. For example, if you specify `COMPROWS 1000000` (1,000,000) and the system contains four total slices, no more than 250,000 rows for each slice are read and analyzed.  
If COMPROWS isn't specified, the sample size defaults to 100,000 for each slice. Values of COMPROWS lower than the default of 100,000 rows for each slice are automatically upgraded to the default value. However, automatic compression will not take place if the amount of data being loaded is insufficient to produce a meaningful sample.  
If the COMPROWS number is greater than the number of rows in the input file, the COPY command still proceeds and runs the compression analysis on all of the available rows. The accepted range for this argument is a number between 1000 and 2147483647 (2,147,483,647).

COMPUPDATE [ PRESET \| \{ ON \| TRUE \} \| \{ OFF \| FALSE \} ]  <a name="copy-compupdate"></a>
Controls whether compression encodings are automatically applied during a COPY.   
When COMPUPDATE is PRESET, the COPY command chooses the compression encoding for each column if the target table is empty, even if the columns already have encodings other than RAW. Currently specified column encodings can be replaced. Encoding for each column is based on the column data type. No data is sampled. Amazon Redshift automatically assigns compression encoding as follows:  
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, or DOUBLE PRECISION data types are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR or VARCHAR are assigned LZO compression.
When COMPUPDATE is omitted, the COPY command chooses the compression encoding for each column only if the target table is empty and you have not specified an encoding (other than RAW) for any of the columns. The encoding for each column is determined by Amazon Redshift. No data is sampled.   
When COMPUPDATE is ON (or TRUE), or COMPUPDATE is specified without an option, the COPY command applies automatic compression if the table is empty, even if the table columns already have encodings other than RAW. Currently specified column encodings can be replaced. Encoding for each column is based on an analysis of sample data. For more information, see [Loading tables with automatic compression](c_Loading_tables_auto_compress.md).  
When COMPUPDATE is OFF (or FALSE), automatic compression is disabled. Column encodings aren't changed.  
For information about the system table to analyze compression, see [STL\_ANALYZE\_COMPRESSION](r_STL_ANALYZE_COMPRESSION.md). 

IGNOREALLERRORS   <a name="copy-ignoreallerrors"></a>
You can specify this option to ignore all errors that occur during the load operation.   
You can't specify the IGNOREALLERRORS option if you specify the MAXERROR option. You can't specify the IGNOREALLERRORS option for columnar formats including ORC and Parquet.

MAXERROR [AS] *error\_count*   <a name="copy-maxerror"></a>
If the load returns the *error\_count* number of errors or greater, the load fails. If the load returns fewer errors, it continues and returns an INFO message that states the number of rows that could not be loaded. Use this parameter to allow loads to continue when certain rows fail to load into the table because of formatting errors or other inconsistencies in the data.   
Set this value to `0` or `1` if you want the load to fail as soon as the first error occurs. The AS keyword is optional. The MAXERROR default value is `0` and the limit is `100000`.  
 The actual number of errors reported might be greater than the specified MAXERROR because of the parallel nature of Amazon Redshift. If any node in the Amazon Redshift cluster detects that MAXERROR has been exceeded, each node reports all of the errors it has encountered.
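For example, to let a load of messy staging data continue until ten rows have failed (hypothetical table and bucket names):

```
copy staging_events from 's3://amzn-s3-demo-bucket/events/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
maxerror 10;
```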

NOLOAD   <a name="copy-noload"></a>
Checks the validity of the data file without actually loading the data. Use the NOLOAD parameter to make sure that your data file loads without any errors before running the actual data load. Running COPY with the NOLOAD parameter is much faster than loading the data because it only parses the files.
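A dry-run validation sketch (names are illustrative); no rows are written to the table:

```
copy venue from 's3://amzn-s3-demo-bucket/venue/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|'
noload;
```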

STATUPDATE [ \{ ON \| TRUE \} \| \{ OFF \| FALSE \} ]  <a name="copy-statupdate"></a>
Governs automatic computation and refresh of optimizer statistics at the end of a successful COPY command. By default, if the STATUPDATE parameter isn't used, statistics are updated automatically if the table is initially empty.  
Whenever ingesting data into a nonempty table significantly changes the size of the table, we recommend updating statistics either by running an [ANALYZE](r_ANALYZE.md) command or by using the STATUPDATE ON argument.  
With STATUPDATE ON (or TRUE), statistics are updated automatically regardless of whether the table is initially empty. If STATUPDATE is used, the current user must be either the table owner or a superuser. If STATUPDATE is not specified, only INSERT permission is required.  
With STATUPDATE OFF (or FALSE), statistics are never updated.  
For additional information, see [Analyzing tables](t_Analyzing_tables.md).

# Alphabetical parameter list
<a name="r_COPY-alphabetical-parm-list"></a>

The following list provides links to each COPY command parameter description, sorted alphabetically.
+ [ACCEPTANYDATE](copy-parameters-data-conversion.md#copy-acceptanydate)
+ [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars)
+ [ACCESS\_KEY\_ID, SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id-access)
+ [AVRO](copy-parameters-data-format.md#copy-avro)
+ [BLANKSASNULL](copy-parameters-data-conversion.md#copy-blanksasnull)
+ [BZIP2](copy-parameters-file-compression.md#copy-bzip2) 
+ [COMPROWS](copy-parameters-data-load.md#copy-comprows)
+ [COMPUPDATE](copy-parameters-data-load.md#copy-compupdate)
+ [CREDENTIALS](copy-parameters-authorization.md#copy-credentials-cred)
+ [CSV](copy-parameters-data-format.md#copy-csv)
+ [DATEFORMAT](copy-parameters-data-conversion.md#copy-dateformat)
+ [DELIMITER](copy-parameters-data-format.md#copy-delimiter)
+ [EMPTYASNULL](copy-parameters-data-conversion.md#copy-emptyasnull)
+ [ENCODING](copy-parameters-data-conversion.md#copy-encoding)
+ [ENCRYPTED](copy-parameters-data-source-s3.md#copy-encrypted)
+ [ESCAPE](copy-parameters-data-conversion.md#copy-escape)
+ [EXPLICIT_IDS](copy-parameters-data-conversion.md#copy-explicit-ids)
+ [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord)
+ [FIXEDWIDTH](copy-parameters-data-format.md#copy-fixedwidth)
+ [FORMAT](copy-parameters-data-format.md#copy-format)
+ [FROM](copy-parameters-data-source-s3.md#copy-parameters-from)
+ [GZIP](copy-parameters-file-compression.md#copy-gzip)
+ [IAM\_ROLE](copy-parameters-authorization.md#copy-iam-role-iam)
+ [IGNOREALLERRORS](copy-parameters-data-load.md#copy-ignoreallerrors)
+ [IGNOREBLANKLINES](copy-parameters-data-conversion.md#copy-ignoreblanklines)
+ [IGNOREHEADER](copy-parameters-data-conversion.md#copy-ignoreheader)
+ [JSON format for COPY](copy-parameters-data-format.md#copy-json)
+ [LZOP](copy-parameters-file-compression.md#copy-lzop)
+ [MANIFEST](copy-parameters-data-source-s3.md#copy-manifest)
+ [MASTER_SYMMETRIC_KEY](copy-parameters-data-source-s3.md#copy-master-symmetric-key)
+ [MAXERROR](copy-parameters-data-load.md#copy-maxerror)
+ [NOLOAD](copy-parameters-data-load.md#copy-noload)
+ [NULL AS](copy-parameters-data-conversion.md#copy-null-as)
+ [READRATIO](copy-parameters-data-source-dynamodb.md#copy-readratio)
+ [REGION](copy-parameters-data-source-s3.md#copy-region)
+ [REMOVEQUOTES](copy-parameters-data-conversion.md#copy-removequotes)
+ [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec)
+ [SESSION\_TOKEN](copy-parameters-authorization.md#copy-token)
+ [SHAPEFILE](copy-parameters-data-format.md#copy-shapefile)
+ [SSH](copy-parameters-data-source-ssh.md#copy-ssh)
+ [STATUPDATE](copy-parameters-data-load.md#copy-statupdate)
+ [TIMEFORMAT](copy-parameters-data-conversion.md#copy-timeformat)
+ [TRIMBLANKS](copy-parameters-data-conversion.md#copy-trimblanks)
+ [TRUNCATECOLUMNS](copy-parameters-data-conversion.md#copy-truncatecolumns)
+ [ZSTD](copy-parameters-file-compression.md#copy-zstd)

# Usage notes
<a name="r_COPY_usage_notes"></a>

**Topics**
+ [Permissions to access other AWS Resources](copy-usage_notes-access-permissions.md)
+ [Using COPY with Amazon S3 access point aliases](copy-usage_notes-s3-access-point-alias.md)
+ [Loading multibyte data from Amazon S3](copy-usage_notes-multi-byte.md)
+ [Loading a column of the GEOMETRY or GEOGRAPHY data type](copy-usage_notes-spatial-data.md)
+ [Loading the HLLSKETCH data type](copy-usage_notes-hll.md)
+ [Loading a column of the VARBYTE data type](copy-usage-varbyte.md)
+ [Errors when reading multiple files](copy-usage_notes-multiple-files.md)
+ [COPY from JSON format](copy-usage_notes-copy-from-json.md)
+ [COPY from columnar data formats](copy-usage_notes-copy-from-columnar.md)
+ [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md)
+ [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md)

# Permissions to access other AWS Resources
<a name="copy-usage_notes-access-permissions"></a>

 To move data between your cluster and another AWS resource, such as Amazon S3, Amazon DynamoDB, Amazon EMR, or Amazon EC2, your cluster must have permission to access the resource and perform the necessary actions. For example, to load data from Amazon S3, COPY must have LIST access to the bucket and GET access for the bucket objects. For information about minimum permissions, see [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions).

To get authorization to access the resource, your cluster must be authenticated. You can choose either of the following authentication methods: 
+ [Role-based access control](#copy-usage_notes-access-role-based) – For role-based access control, you specify an AWS Identity and Access Management (IAM) role that your cluster uses for authentication and authorization. To safeguard your AWS credentials and sensitive data, we strongly recommend using role-based authentication.
+ [Key-based access control](#copy-usage_notes-access-key-based) – For key-based access control, you provide the AWS access credentials (access key ID and secret access key) for a user as plain text.

## Role-based access control
<a name="copy-usage_notes-access-role-based"></a>

With <a name="copy-usage_notes-access-role-based.phrase"></a>role-based access control, your cluster temporarily assumes an IAM role on your behalf. Then, based on the authorizations granted to the role, your cluster can access the required AWS resources.

Creating an IAM *role* is similar to granting permissions to a user, in that it is an AWS identity with permissions policies that determine what the identity can and can't do in AWS. However, instead of being uniquely associated with one user, a role can be assumed by any entity that needs it. Also, a role doesn’t have any credentials (a password or access keys) associated with it. Instead, if a role is associated with a cluster, access keys are created dynamically and provided to the cluster.

We recommend using role-based access control because it provides more secure, fine-grained control of access to AWS resources and sensitive user data, in addition to safeguarding your AWS credentials.

Role-based authentication delivers the following benefits:
+ You can use AWS standard IAM tools to define an IAM role and associate the role with multiple clusters. When you modify the access policy for a role, the changes are applied automatically to all clusters that use the role.
+ You can define fine-grained IAM policies that grant permissions for specific clusters and database users to access specific AWS resources and actions.
+ Your cluster obtains temporary session credentials at run time and refreshes the credentials as needed until the operation completes. If you use key-based temporary credentials, the operation fails if the temporary credentials expire before it completes.
+ Your access key ID and secret access key ID aren't stored or transmitted in your SQL code.

To use role-based access control, you must first create an IAM role using the Amazon Redshift service role type, and then attach the role to your cluster. The role must have, at a minimum, the permissions listed in [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions). For steps to create an IAM role and attach it to your cluster, see [Authorizing Amazon Redshift to Access Other AWS Services On Your Behalf](https://docs.aws.amazon.com/redshift/latest/mgmt/authorizing-redshift-service.html) in the *Amazon Redshift Management Guide*.

You can add a role to a cluster or view the roles associated with a cluster by using the Amazon Redshift Management Console, CLI, or API. For more information, see [Associating an IAM Role With a Cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/copy-unload-iam-role.html) in the *Amazon Redshift Management Guide*.

When you create an IAM role, IAM returns an Amazon Resource Name (ARN) for the role. To specify an IAM role, provide the role ARN with either the [IAM\_ROLE](copy-parameters-authorization.md#copy-iam-role) parameter or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter. 

For example, suppose the following role is attached to the cluster.

```
"IamRoleArn": "arn:aws:iam::0123456789012:role/MyRedshiftRole"
```

The following COPY command example uses the IAM\_ROLE parameter with the ARN in the previous example for authentication and access to Amazon S3.

```
copy customer from 's3://amzn-s3-demo-bucket/mydata'  
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The following COPY command example uses the CREDENTIALS parameter to specify the IAM role.

```
copy customer from 's3://amzn-s3-demo-bucket/mydata' 
credentials 
'aws_iam_role=arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

In addition, a superuser can grant the ASSUMEROLE privilege to database users and groups to provide access to a role for COPY operations. For information, see [GRANT](r_GRANT.md).

## Key-based access control
<a name="copy-usage_notes-access-key-based"></a>

With <a name="copy-usage_notes-access-key-based.phrase"></a>key-based access control, you provide the access key ID and secret access key for an IAM user that is authorized to access the AWS resources that contain the data. You can use either the [ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id) parameters together or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter.

**Note**  
We strongly recommend using an IAM role for authentication instead of supplying a plain-text access key ID and secret access key. If you choose key-based access control, never use your AWS account (root) credentials. Always create an IAM user and provide that user's access key ID and secret access key. For steps to create an IAM user, see [Creating an IAM User in Your AWS Account](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html).

To authenticate using ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY, replace *<access-key-id>* and *<secret-access-key>* with an authorized user's access key ID and full secret access key as shown following. 

```
ACCESS_KEY_ID '<access-key-id>'
SECRET_ACCESS_KEY '<secret-access-key>';
```

To authenticate using the CREDENTIALS parameter, replace *<access-key-id>* and *<secret-access-key>* with an authorized user's access key ID and full secret access key as shown following.

```
CREDENTIALS
'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';
```

The IAM user must have, at a minimum, the permissions listed in [IAM permissions for COPY, UNLOAD, and CREATE LIBRARY](#copy-usage_notes-iam-permissions).

### Temporary security credentials
<a name="r_copy-temporary-security-credentials"></a>

 If you are using key-based access control, you can further limit the access users have to your data by using temporary security credentials. Role-based authentication automatically uses temporary credentials. 

**Note**  
We strongly recommend using [role-based access control](#copy-usage_notes-access-role-based.phrase) instead of creating temporary credentials and providing access key ID and secret access key as plain text. Role-based access control automatically uses temporary credentials. 

Temporary security credentials provide enhanced security because they have short lifespans and can't be reused after they expire. The access key ID and secret access key generated with the token can't be used without the token, and a user who has these temporary security credentials can access your resources only until the credentials expire.

To grant users temporary access to your resources, you call AWS Security Token Service (AWS STS) API operations. The AWS STS API operations return temporary security credentials consisting of a security token, an access key ID, and a secret access key. You issue the temporary security credentials to the users who need temporary access to your resources. These users can be existing IAM users, or they can be non-AWS users. For more information about creating temporary security credentials, see [Using Temporary Security Credentials](https://docs.aws.amazon.com/STS/latest/UsingSTS/Welcome.html) in the IAM User Guide.

You can use either the [ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY](copy-parameters-authorization.md#copy-access-key-id) parameters together with the [SESSION\_TOKEN](copy-parameters-authorization.md#copy-token) parameter or the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter. You must also supply the access key ID and secret access key that were provided with the token.

To authenticate using ACCESS\_KEY\_ID, SECRET\_ACCESS\_KEY, and SESSION\_TOKEN, replace *<temporary-access-key-id>*, *<temporary-secret-access-key>*, and *<temporary-token>* as shown following. 

```
ACCESS_KEY_ID '<temporary-access-key-id>'
SECRET_ACCESS_KEY '<temporary-secret-access-key>'
SESSION_TOKEN '<temporary-token>';
```

To authenticate using CREDENTIALS, include `session_token=<temporary-token>` in the credentials string as shown following. 

```
CREDENTIALS
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;session_token=<temporary-token>';
```

The following example shows a COPY command with temporary security credentials.

```
copy table-name
from 's3://objectpath'
access_key_id '<temporary-access-key-id>'
secret_access_key '<temporary-secret-access-key>'
session_token '<temporary-token>';
```

The following example loads the LISTING table with temporary credentials and file encryption.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
access_key_id '<temporary-access-key-id>'
secret_access_key '<temporary-secret-access-key>'
session_token '<temporary-token>'
master_symmetric_key '<root-key>'
encrypted;
```

The following example loads the LISTING table using the CREDENTIALS parameter with temporary credentials and file encryption.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt'
credentials 
'aws_access_key_id=<temporary-access-key-id>;aws_secret_access_key=<temporary-secret-access-key>;session_token=<temporary-token>;master_symmetric_key=<root-key>'
encrypted;
```

**Important**  
The temporary security credentials must be valid for the entire duration of the COPY or UNLOAD operation. If the temporary security credentials expire during the operation, the command fails and the transaction is rolled back. For example, if temporary security credentials expire after 15 minutes and the COPY operation requires one hour, the COPY operation fails before it completes. If you use role-based access, the temporary security credentials are automatically refreshed until the operation completes.

## IAM permissions for COPY, UNLOAD, and CREATE LIBRARY
<a name="copy-usage_notes-iam-permissions"></a>

The IAM role or user referenced by the CREDENTIALS parameter must have, at a minimum, the following permissions:
+ For COPY from Amazon S3, permission to LIST the Amazon S3 bucket and GET the Amazon S3 objects that are being loaded, and the manifest file, if one is used.
+ For COPY from Amazon S3, Amazon EMR, and remote hosts (SSH) with JSON-formatted data, permission to LIST and GET the JSONPaths file on Amazon S3, if one is used. 
+ For COPY from DynamoDB, permission to SCAN and DESCRIBE the DynamoDB table that is being loaded. 
+ For COPY from an Amazon EMR cluster, permission for the `ListInstances` action on the Amazon EMR cluster. 
+ For UNLOAD to Amazon S3, GET, LIST, and PUT permissions for the Amazon S3 bucket to which the data files are being unloaded.
+ For CREATE LIBRARY from Amazon S3, permission to LIST the Amazon S3 bucket and GET the Amazon S3 objects being imported.

**Note**  
If you receive the error message `S3ServiceException: Access Denied` when running a COPY, UNLOAD, or CREATE LIBRARY command, your cluster doesn't have proper access permissions for Amazon S3.

You can manage IAM permissions by attaching an IAM policy to an IAM role that is attached to your cluster, to a user, or to the group to which your user belongs. For example, the `AmazonS3ReadOnlyAccess` managed policy grants LIST and GET permissions to Amazon S3 resources. For more information about IAM policies, see [Managing IAM Policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage.html) in the *IAM User Guide*. 

# Using COPY with Amazon S3 access point aliases
<a name="copy-usage_notes-s3-access-point-alias"></a>

COPY supports Amazon S3 access point aliases. For more information, see [Using a bucket–style alias for your access point](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-points-alias.html) in the *Amazon Simple Storage Service User Guide*.

# Loading multibyte data from Amazon S3
<a name="copy-usage_notes-multi-byte"></a>

If your data includes non-ASCII multibyte characters (such as Chinese or Cyrillic characters), you must load the data to VARCHAR columns. The VARCHAR data type supports four-byte UTF-8 characters, but the CHAR data type only accepts single-byte ASCII characters. You can't load five-byte or longer characters into Amazon Redshift tables. For more information, see [Multibyte characters](c_Supported_data_types.md#c_Supported_data_types-multi-byte-characters). 
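As a quick local illustration (unrelated to any Redshift API), the following Python snippet shows how many bytes common characters occupy in UTF-8, which is the unit that determines how much of a VARCHAR column's declared size each character consumes:

```python
# Local illustration (no Redshift involved): UTF-8 byte widths of
# sample characters. A multibyte character consumes more than one
# byte of a VARCHAR column's declared length.
samples = {
    "A": 1,   # ASCII letter
    "é": 2,   # accented Latin letter
    "中": 3,   # CJK character
    "🚀": 4,   # emoji, the widest valid UTF-8 encoding
}
for ch, width in samples.items():
    assert len(ch.encode("utf-8")) == width
```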

# Loading a column of the GEOMETRY or GEOGRAPHY data type
<a name="copy-usage_notes-spatial-data"></a>

You can COPY to `GEOMETRY` or `GEOGRAPHY` columns from data in a character-delimited text file, such as a CSV file. The data must be in the hexadecimal form of the well-known binary format (either WKB or EWKB) or the well-known text format (either WKT or EWKT) and fit within the maximum size of a single input row to the COPY command. For more information, see [COPY](r_COPY.md). 

For information about how to load from a shapefile, see [Loading a shapefile into Amazon Redshift](spatial-copy-shapefile.md).

For more information about the `GEOMETRY` or `GEOGRAPHY` data type, see [Querying spatial data in Amazon Redshift](geospatial-overview.md).

# Loading the HLLSKETCH data type
<a name="copy-usage_notes-hll"></a>

You can copy HLL sketches only in the sparse or dense formats that Amazon Redshift supports. To use the COPY command on HyperLogLog sketches, use the Base64 format for dense HyperLogLog sketches and the JSON format for sparse HyperLogLog sketches. For more information, see [HyperLogLog functions](hyperloglog-functions.md). 

The following example imports data from a CSV file into a table using CREATE TABLE and COPY. First, the example creates the table `t1` using CREATE TABLE.

```
CREATE TABLE t1 (sketch hllsketch, a bigint);
```

Then it uses COPY to import data from a CSV file into the table `t1`. 

```
COPY t1 FROM 's3://amzn-s3-demo-bucket/unload/' IAM_ROLE 'arn:aws:iam::0123456789012:role/MyRedshiftRole' NULL AS 'null' CSV;
```

# Loading a column of the VARBYTE data type
<a name="copy-usage-varbyte"></a>

You can load VARBYTE data from a file in CSV, Parquet, or ORC format. For CSV, the data is loaded from the file as the hexadecimal representation of the VARBYTE data. You can't load VARBYTE data with the `FIXEDWIDTH` option. The `ADDQUOTES` and `REMOVEQUOTES` options of COPY aren't supported. A VARBYTE column can't be used as a partition column. 

# Errors when reading multiple files
<a name="copy-usage_notes-multiple-files"></a>

The COPY command is atomic and transactional. In other words, even when the COPY command reads data from multiple files, the entire process is treated as a single transaction. If COPY encounters an error reading a file, it automatically retries until the process times out (see [statement\_timeout](r_statement_timeout.md)) or until data can't be downloaded from Amazon S3 for a prolonged period of time (between 15 and 30 minutes), ensuring that each file is loaded only once. If the COPY command fails, the entire transaction is canceled and all changes are rolled back. For more information about handling load errors, see [Troubleshooting data loads](t_Troubleshooting_load_errors.md). 

After a COPY command is successfully initiated, it doesn't fail if the session terminates, for example when the client disconnects. However, if the COPY command is within a BEGIN … END transaction block that doesn't complete because the session terminates, the entire transaction, including the COPY, is rolled back. For more information about transactions, see [BEGIN](r_BEGIN.md).

# COPY from JSON format
<a name="copy-usage_notes-copy-from-json"></a>

The JSON data structure is made up of a set of objects or arrays. A JSON *object* begins and ends with braces, and contains an unordered collection of name-value pairs. Each name and value are separated by a colon, and the pairs are separated by commas. The name is a string in double quotation marks. The quotation mark characters must be simple quotation marks (0x22), not slanted or "smart" quotation marks. 

A JSON *array* begins and ends with brackets, and contains an ordered collection of values separated by commas. A value can be a string in double quotation marks, a number, a Boolean true or false, null, a JSON object, or an array. 

JSON objects and arrays can be nested, enabling a hierarchical data structure. The following example shows a JSON data structure with two valid objects. 

```
{
    "id": 1006410,
    "title": "Amazon Redshift Database Developer Guide"
}
{
    "id": 100540,
    "name": "Amazon Simple Storage Service User Guide"
}
```

The following shows the same data as two JSON arrays.

```
[
    1006410,
    "Amazon Redshift Database Developer Guide"
]
[
    100540,
    "Amazon Simple Storage Service User Guide"
]
```

## COPY options for JSON
<a name="copy-usage-json-options"></a>

You can specify the following options when using COPY with JSON format data: 
+ `'auto' `– COPY automatically loads fields from the JSON file. 
+ `'auto ignorecase'` – COPY automatically loads fields from the JSON file while ignoring the case of field names.
+ `s3://jsonpaths_file` – COPY uses a JSONPaths file to parse the JSON source data. A *JSONPaths file* is a text file that contains a single JSON object with the name `"jsonpaths"` paired with an array of JSONPath expressions. If the name is any string other than `"jsonpaths"`, COPY uses the `'auto'` argument instead of using the JSONPaths file. 

For examples that show how to load data using `'auto'`, `'auto ignorecase'`, or a JSONPaths file, and using either JSON objects or arrays, see [Copy from JSON examples](r_COPY_command_examples.md#r_COPY_command_examples-copy-from-json). 

## JSONPath option
<a name="copy-usage-json-options"></a>

In the Amazon Redshift COPY syntax, a JSONPath expression specifies the explicit path to a single name element in a JSON hierarchical data structure, using either bracket notation or dot notation. Amazon Redshift doesn't support any JSONPath elements, such as wildcard characters or filter expressions, that might resolve to an ambiguous path or multiple name elements. As a result, Amazon Redshift can't parse complex, multi-level data structures.

The following is an example of a JSONPaths file with JSONPath expressions using bracket notation. The dollar sign ($) represents the root-level structure. 

```
{
    "jsonpaths": [
       "$['id']",
       "$['store']['book']['title']",
	"$['location'][0]" 
    ]
}
```

In the previous example, `$['location'][0]` references the first element in an array. JSON uses zero-based array indexing. Array indexes must be nonnegative integers (greater than or equal to zero).
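To make the zero-based indexing concrete, here is a small Python helper (hypothetical, not part of Redshift) that resolves the simple bracket-notation subset shown above, using explicit names and zero-based numeric indexes only:

```python
import re

# Hypothetical helper (not a Redshift API): resolve the bracket-
# notation subset COPY accepts -- explicit names and zero-based
# numeric indexes, with no wildcards or filter expressions.
def resolve(jsonpath, data):
    for name, index in re.findall(r"\['([^']+)'\]|\[(\d+)\]", jsonpath):
        data = data[name] if name else data[int(index)]
    return data

doc = {"id": 7, "location": ["Seattle", "Boston"]}
assert resolve("$['id']", doc) == 7
assert resolve("$['location'][0]", doc) == "Seattle"  # zero-based index
```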

The following example shows the previous JSONPaths file using dot notation. 

```
{
    "jsonpaths": [
       "$.id",
       "$.store.book.title",
	"$.location[0]"
    ]
}
```

You can't mix bracket notation and dot notation in the `jsonpaths` array. Brackets can be used in both bracket notation and dot notation to reference an array element. 

When using dot notation, the JSONPath expressions can't contain the following characters: 
+ Single straight quotation mark ( ' ) 
+ Period, or dot ( . ) 
+ Brackets ( [ ] ) unless used to reference an array element 
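A regular expression can serve as a local pre-flight check for these dot-notation restrictions. The pattern below is a sketch under stated assumptions, not a Redshift-defined grammar:

```python
import re

# Hypothetical pre-flight check (not a Redshift API): accept only
# dot-notation expressions of the form $.name.name[0]... where
# names contain no quotation marks, dots, or brackets.
DOT_PATH = re.compile(r"^\$(\.[^.'\[\]]+(\[\d+\])*)+$")

valid = ["$.id", "$.store.book.title", "$.location[0]"]
invalid = ["$.store['book']", "$.it's", "$.a.b."]

assert all(DOT_PATH.match(p) for p in valid)
assert not any(DOT_PATH.match(p) for p in invalid)
```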

If the value in the name-value pair referenced by a JSONPath expression is an object or an array, the entire object or array is loaded as a string, including the braces or brackets. For example, suppose that your JSON data contains the following object. 

```
{
    "id": 0,
    "guid": "84512477-fa49-456b-b407-581d0d851c3c",
    "isActive": true,
    "tags": [
        "nisi",
        "culpa",
        "ad",
        "amet",
        "voluptate",
        "reprehenderit",
        "veniam"
    ],
    "friends": [
        {
            "id": 0,
            "name": "Martha Rivera"
        },
        {
            "id": 1,
            "name": "Renaldo"
        }
    ]
}
```

The JSONPath expression `$['tags']` then returns the following value. 

```
"["nisi","culpa","ad","amet","voluptate","reprehenderit","veniam"]" 
```

The JSONPath expression `$['friends'][1]` then returns the following value. 

```
"{"id": 1,"name": "Renaldo"}" 
```
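The same stringification can be reproduced locally with `json.dumps`. The exact whitespace in COPY's loaded value may differ; the point is that the braces and brackets are preserved in the string:

```python
import json

# Sketch of the behavior described above: when a JSONPath lands on
# an object or array, the serialized text is loaded as a string,
# braces and brackets included, rather than a scalar value.
data = {
    "tags": ["nisi", "culpa"],
    "friends": [{"id": 0, "name": "Martha Rivera"},
                {"id": 1, "name": "Renaldo"}],
}

tags_value = json.dumps(data["tags"], separators=(",", ":"))
friend_value = json.dumps(data["friends"][1], separators=(",", ":"))

assert tags_value == '["nisi","culpa"]'
assert friend_value == '{"id":1,"name":"Renaldo"}'
```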

Each JSONPath expression in the `jsonpaths` array corresponds to one column in the Amazon Redshift target table. The order of the `jsonpaths` array elements must match the order of the columns in the target table or the column list, if a column list is used. 

For examples that show how to load data using either the `'auto'` argument or a JSONPaths file, and using either JSON objects or arrays, see [Copy from JSON examples](r_COPY_command_examples.md#r_COPY_command_examples-copy-from-json). 

For information on how to copy multiple JSON files, see [Using a manifest to specify data files](loading-data-files-using-manifest.md).

## Escape characters in JSON
<a name="copy-usage-json-escape-characters"></a>

COPY loads `\n` as a newline character and loads `\t` as a tab character. To load a backslash, escape it with a backslash ( `\\` ).

For example, suppose you have the following JSON in a file named `escape.json` in the bucket `s3://amzn-s3-demo-bucket/json/`.

```
{
  "backslash": "This is a backslash: \\",
  "newline": "This sentence\n is on two lines.",
  "tab": "This sentence \t contains a tab."
}
```

Run the following commands to create the ESCAPES table and load the JSON.

```
create table escapes (backslash varchar(25), newline varchar(35), tab varchar(35));

copy escapes from 's3://amzn-s3-demo-bucket/json/escape.json' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as json 'auto';
```

Query the ESCAPES table to view the results.

```
select * from escapes;

       backslash        |      newline      |               tab
------------------------+-------------------+----------------------------------
 This is a backslash: \ | This sentence     | This sentence    contains a tab.
                        :  is on two lines.
(1 row)
```
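The same decoding rules apply in any standard JSON parser, which makes it easy to verify escape handling locally before loading:

```python
import json

# The JSON escapes discussed above, decoded with the standard json
# module: \\ yields a literal backslash, \n a newline, \t a tab.
record = json.loads(r'''
{
  "backslash": "This is a backslash: \\",
  "newline": "Line one\nLine two",
  "tab": "Before\tafter"
}
''')

assert record["backslash"].endswith("\\")   # single literal backslash
assert "\n" in record["newline"]            # real newline character
assert "\t" in record["tab"]                # real tab character
```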

## Loss of numeric precision
<a name="copy-usage-json-rounding"></a>

You might lose precision when loading numbers from data files in JSON format to a column that is defined as a numeric data type. Some floating point values aren't represented exactly in computer systems. As a result, data you copy from a JSON file might not be rounded as you expect. To avoid a loss of precision, we recommend using one of the following alternatives:
+ Represent the number as a string by enclosing the value in double quotation characters.
+ Use [ROUNDEC](copy-parameters-data-conversion.md#copy-roundec) to round the number instead of truncating.
+ Instead of using JSON or Avro files, use CSV, character-delimited, or fixed-width text files.
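The underlying issue is easy to reproduce locally: a JSON number passes through a binary floating-point value, while the same value quoted as a string can be parsed exactly (for example, into a DECIMAL):

```python
import json
from decimal import Decimal

# Demonstrates the precision issue: a JSON number becomes a binary
# float, while the same value quoted as a string survives exactly
# when parsed with an exact decimal type.
as_number = json.loads('{"price": 0.1}')["price"]
as_string = json.loads('{"price": "0.1"}')["price"]

assert as_number * 3 != 0.3                      # binary float drift
assert Decimal(as_string) * 3 == Decimal("0.3")  # exact via string
```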

# COPY from columnar data formats
<a name="copy-usage_notes-copy-from-columnar"></a>

COPY can load data from Amazon S3 in the following columnar formats:
+ ORC
+ Parquet

For examples of using COPY from columnar data formats, see [COPY examples](r_COPY_command_examples.md).

COPY supports columnar formatted data with the following considerations:
+ The Amazon S3 bucket must be in the same AWS Region as the Amazon Redshift database. 
+ To access your Amazon S3 data through a VPC endpoint, set up access using IAM policies and IAM roles as described in [Using Amazon Redshift Spectrum with Enhanced VPC Routing](https://docs.aws.amazon.com/redshift/latest/mgmt/spectrum-enhanced-vpc.html) in the *Amazon Redshift Management Guide*. 
+ COPY doesn't automatically apply compression encodings. 
+ Only the following COPY parameters are supported: 
  + [ACCEPTINVCHARS](copy-parameters-data-conversion.md#copy-acceptinvchars) when copying from an ORC or Parquet file.
  + [FILLRECORD](copy-parameters-data-conversion.md#copy-fillrecord)
  + [FROM](copy-parameters-data-source-s3.md#copy-parameters-from)
  + [IAM\_ROLE](copy-parameters-authorization.md#copy-iam-role)
  + [CREDENTIALS](copy-parameters-authorization.md#copy-credentials)
  + [STATUPDATE ](copy-parameters-data-load.md#copy-statupdate)
  + [MANIFEST](copy-parameters-data-source-s3.md#copy-manifest)
  + [EXPLICIT\_IDS](copy-parameters-data-conversion.md#copy-explicit-ids)
+ If COPY encounters an error while loading, the command fails. ACCEPTANYDATE and MAXERROR aren't supported for columnar data types.
+ Error messages are sent to the SQL client. Some errors are logged in STL\_LOAD\_ERRORS and STL\_ERROR.
+ COPY inserts values into the target table's columns in the same order as the columns occur in the columnar data files. The number of columns in the target table and the number of columns in the data file must match.
+ If the file you specify for the COPY operation includes one of the following extensions, we decompress the data without the need for adding any parameters: 
  + `.gz`
  + `.snappy`
  + `.bz2`
+ COPY from the Parquet and ORC file formats uses Redshift Spectrum, which accesses the bucket on your behalf. To use COPY for these formats, be sure there are no IAM policies blocking the use of Amazon S3 presigned URLs. The presigned URLs generated by Amazon Redshift are valid for 1 hour so that Amazon Redshift has enough time to load all the files from the Amazon S3 bucket. A unique presigned URL is generated for each file scanned by COPY from columnar data formats. For bucket policies that include an `s3:signatureAge` action, make sure to set the value to at least 3,600,000 milliseconds. For more information, see [Using Amazon Redshift Spectrum with enhanced VPC routing](https://docs.aws.amazon.com/redshift/latest/mgmt/spectrum-enhanced-vpc.html).
+ The REGION parameter isn't supported with COPY from columnar data formats. Even if your Amazon S3 bucket and your database are in the same AWS Region, you can encounter an error such as `REGION argument is not supported for PARQUET based COPY`.
+ COPY from columnar formats supports concurrency scaling. To enable concurrency scaling, see [Configuring concurrency scaling queues](https://docs.aws.amazon.com/redshift/latest/dg/concurrency-scaling.html#concurrency-scaling-queues).

# DATEFORMAT and TIMEFORMAT strings
<a name="r_DATEFORMAT_and_TIMEFORMAT_strings"></a>

The COPY command uses the DATEFORMAT and TIMEFORMAT options to parse date and time values in your source data. DATEFORMAT and TIMEFORMAT are formatted strings that must match the format of your source data's date and time values. For example, a COPY command loading source data with the date value `Jan-01-1999` must include the following DATEFORMAT string:

```
COPY ...
            DATEFORMAT AS 'MON-DD-YYYY'
```

For more information on managing COPY data conversions, see [Data conversion parameters](https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html). 

DATEFORMAT and TIMEFORMAT strings can contain datetime separators (such as '`-`', '`/`', or '`:`'), as well as the datepart and timepart formats in the following table.

**Note**  
If you can't match the format of your date or time values with the following dateparts and timeparts, or if you have date and time values that use formats different from each other, use the `'auto'` argument with the DATEFORMAT or TIMEFORMAT parameter. The `'auto'` argument recognizes several formats that aren't supported when using a DATEFORMAT or TIMEFORMAT string. For more information, see [Using automatic recognition with DATEFORMAT and TIMEFORMAT](automatic-recognition.md).

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_DATEFORMAT_and_TIMEFORMAT_strings.html)

The default date format is YYYY-MM-DD. The default timestamp without time zone (TIMESTAMP) format is YYYY-MM-DD HH:MI:SS. The default timestamp with time zone (TIMESTAMPTZ) format is YYYY-MM-DD HH:MI:SSOF, where OF is the offset from UTC (for example, -8:00). You can't include a time zone specifier (TZ, tz, or OF) in the timeformat\_string. The seconds (SS) field also supports fractional seconds up to a microsecond level of detail. To load TIMESTAMPTZ data that is in a format different from the default format, specify 'auto'.
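For a quick local sanity check of source data against the default formats, the patterns above correspond (as an assumed mapping, checked here with Python's `strptime`) to these directives:

```python
from datetime import datetime

# Assumed strptime equivalents of the Redshift default patterns
# described above, useful for validating source data locally.
DEFAULTS = {
    "DATE":      "%Y-%m-%d",            # YYYY-MM-DD
    "TIMESTAMP": "%Y-%m-%d %H:%M:%S",   # YYYY-MM-DD HH:MI:SS
}

assert datetime.strptime("1999-01-01", DEFAULTS["DATE"]).year == 1999
ts = datetime.strptime("1999-01-01 04:05:06", DEFAULTS["TIMESTAMP"])
assert (ts.hour, ts.minute, ts.second) == (4, 5, 6)
```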

Following are some sample dates or times you can encounter in your source data, and the corresponding DATEFORMAT or TIMEFORMAT strings for them.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_DATEFORMAT_and_TIMEFORMAT_strings.html)

## Example
<a name="r_DATEFORMAT_and_TIMEFORMAT_strings-examples"></a>

For an example of using TIMEFORMAT, see [Load a timestamp or datestamp](r_COPY_command_examples.md#r_COPY_command_examples-load-a-time-datestamp).

# Using automatic recognition with DATEFORMAT and TIMEFORMAT
<a name="automatic-recognition"></a>

If you specify `'auto'` as the argument for the DATEFORMAT or TIMEFORMAT parameter, Amazon Redshift will automatically recognize and convert the date format or time format in your source data. The following shows an example.

```
copy favoritemovies from 'dynamodb://ProductCatalog' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
dateformat 'auto';
```

When used with the `'auto'` argument for DATEFORMAT and TIMEFORMAT, COPY recognizes and converts the date and time formats listed in the table in [DATEFORMAT and TIMEFORMAT strings](r_DATEFORMAT_and_TIMEFORMAT_strings.md). In addition, the `'auto'` argument recognizes the following formats that aren't supported when using a DATEFORMAT or TIMEFORMAT string.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/automatic-recognition.html)

Automatic recognition doesn't support epochsecs and epochmillisecs.

To test whether a date or timestamp value will be automatically converted, use a CAST function to attempt to convert the string to a date or timestamp value. For example, the following commands test the timestamp value `'J2345678 04:05:06.789'`:

```
create table formattest (test char(21));
insert into formattest values('J2345678 04:05:06.789');
select test, cast(test as timestamp) as timestamp, cast(test as date) as date from formattest;

         test          |      timestamp      |    date
-----------------------+---------------------+------------
 J2345678 04:05:06.789 | 1710-02-23 04:05:06 | 1710-02-23
```

If the source data for a DATE column includes time information, the time component is truncated. If the source data for a TIMESTAMP column omits time information, 00:00:00 is used for the time component.

# COPY examples
<a name="r_COPY_command_examples"></a>

**Note**  
These examples contain line breaks for readability. Do not include line breaks or spaces in your *credentials-args* string.

**Topics**
+ [Load FAVORITEMOVIES from a DynamoDB table](#r_COPY_command_examples-load-favoritemovies-from-an-amazon-dynamodb-table)
+ [Load LISTING from an Amazon S3 bucket](#r_COPY_command_examples-load-listing-from-an-amazon-s3-bucket)
+ [Load LISTING from an Amazon EMR cluster](#copy-command-examples-emr)
+ [Example: COPY from Amazon S3 using a manifest](#copy-command-examples-manifest)
+ [Load LISTING from a pipe-delimited file (default delimiter)](#r_COPY_command_examples-load-listing-from-a-pipe-delimited-file-default-delimiter)
+ [Load LISTING using columnar data in Parquet format](#r_COPY_command_examples-load-listing-from-parquet)
+ [Load LISTING using columnar data in ORC format](#r_COPY_command_examples-load-listing-from-orc)
+ [Load EVENT with options](#r_COPY_command_examples-load-event-with-options)
+ [Load VENUE from a fixed-width data file](#r_COPY_command_examples-load-venue-from-a-fixed-width-data-file)
+ [Load CATEGORY from a CSV file](#load-from-csv)
+ [Load VENUE with explicit values for an IDENTITY column](#r_COPY_command_examples-load-venue-with-explicit-values-for-an-identity-column)
+ [Load TIME from a pipe-delimited GZIP file](#r_COPY_command_examples-load-time-from-a-pipe-delimited-gzip-file)
+ [Load a timestamp or datestamp](#r_COPY_command_examples-load-a-time-datestamp)
+ [Load data from a file with default values](#r_COPY_command_examples-load-data-from-a-file-with-default-values)
+ [COPY data with the ESCAPE option](#r_COPY_command_examples-copy-data-with-the-escape-option)
+ [Copy from JSON examples](#r_COPY_command_examples-copy-from-json)
+ [Copy from Avro examples](#r_COPY_command_examples-copy-from-avro)
+ [Preparing files for COPY with the ESCAPE option](#r_COPY_preparing_data)
+ [Loading a shapefile into Amazon Redshift](#copy-example-spatial-copy-shapefile)
+ [COPY command with the NOLOAD option](#r_COPY_command_examples-load-noload-option)
+ [COPY command with a multibyte delimiter and the ENCODING option](#r_COPY_command_examples-load-encoding-multibyte-delimiter-option)

## Load FAVORITEMOVIES from a DynamoDB table
<a name="r_COPY_command_examples-load-favoritemovies-from-an-amazon-dynamodb-table"></a>

The AWS SDKs include a simple example of creating a DynamoDB table called *Movies*. (For this example, see [Getting Started with DynamoDB](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.html).) The following example loads the Amazon Redshift MOVIES table with data from the DynamoDB table. The Amazon Redshift table must already exist in the database.

```
copy favoritemovies from 'dynamodb://Movies'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
readratio 50;
```

## Load LISTING from an Amazon S3 bucket
<a name="r_COPY_command_examples-load-listing-from-an-amazon-s3-bucket"></a>

The following example loads LISTING from an Amazon S3 bucket. The COPY command loads all of the files in the `/data/listing/` folder.

```
copy listing
from 's3://amzn-s3-demo-bucket/data/listing/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Load LISTING from an Amazon EMR cluster
<a name="copy-command-examples-emr"></a>

The following example loads the SALES table with tab-delimited data from lzop-compressed files in an Amazon EMR cluster. COPY loads every file in the `myoutput/` folder that begins with `part-`.

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/part-*' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '\t' lzop;
```

The following example loads the SALES table with JSON formatted data in an Amazon EMR cluster. COPY loads every file in the `myoutput/json/` folder.

```
copy sales
from 'emr://j-SAMPLE2B500FC/myoutput/json/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
JSON 's3://amzn-s3-demo-bucket/jsonpaths.txt';
```

## Using a manifest to specify data files
<a name="copy-command-examples-manifest"></a>

You can use a manifest to ensure that your COPY command loads all of the required files, and only the required files, from Amazon S3. You can also use a manifest when you need to load multiple files from different buckets or files that don't share the same prefix. 

For example, suppose that you need to load the following three files: `custdata1.txt`, `custdata2.txt`, and `custdata3.txt`. You could use the following command to load all of the files in `amzn-s3-demo-bucket` that begin with `custdata` by specifying a prefix: 

```
copy category
from 's3://amzn-s3-demo-bucket/custdata' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

If only two of the files exist because of an error, COPY loads only those two files and finishes successfully, resulting in an incomplete data load. If the bucket also contains an unwanted file that happens to use the same prefix, such as a file named `custdata.backup`, COPY loads that file as well, resulting in unwanted data being loaded.
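The over-matching behavior comes down to plain prefix matching on object keys, as this local sketch shows:

```python
# Local illustration of why a key prefix can over-match: any object
# whose key starts with the prefix is loaded, including stragglers
# such as a .backup file.
keys = ["custdata1.txt", "custdata2.txt", "custdata.backup", "other.txt"]
matched = [k for k in keys if k.startswith("custdata")]
assert matched == ["custdata1.txt", "custdata2.txt", "custdata.backup"]
```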

To ensure that all of the required files are loaded and to prevent unwanted files from being loaded, you can use a manifest file. The manifest is a JSON-formatted text file that lists the files to be processed by the COPY command. For example, the following manifest loads the three files in the previous example.

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.1",
         "mandatory":true
      },
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.2",
         "mandatory":true
      },
      {  
         "url":"s3://amzn-s3-demo-bucket/custdata.3",
         "mandatory":true
      }
   ]
}
```

The optional `mandatory` flag indicates whether COPY should terminate if the file doesn't exist. The default is `false`. Regardless of any mandatory settings, COPY terminates if no files are found. In this example, COPY returns an error if any of the files isn't found. Unwanted files that might have been picked up if you specified only a key prefix, such as `custdata.backup`, are ignored, because they aren't on the manifest. 
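Before handing a manifest to COPY, you might validate its shape locally. The following checker is a hypothetical helper, not a Redshift feature:

```python
import json

# Hypothetical pre-flight check (not a Redshift feature): confirm a
# manifest has the expected shape before handing it to COPY.
def check_manifest(text):
    doc = json.loads(text)
    entries = doc["entries"]
    assert isinstance(entries, list) and entries, "entries must be non-empty"
    for e in entries:
        assert e["url"].startswith("s3://"), "each entry needs an s3:// url"
        # mandatory is optional and defaults to false
        assert isinstance(e.get("mandatory", False), bool)
    return len(entries)

manifest = '''{"entries": [
  {"url": "s3://amzn-s3-demo-bucket/custdata.1", "mandatory": true},
  {"url": "s3://amzn-s3-demo-bucket/custdata.2", "mandatory": true}
]}'''
assert check_manifest(manifest) == 2
```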

When loading from data files in ORC or Parquet format, a `meta` field is required, as shown in the following example.

```
{  
   "entries":[  
      {  
         "url":"s3://amzn-s3-demo-bucket1/orc/2013-10-04-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      },
      {  
         "url":"s3://amzn-s3-demo-bucket2/orc/2013-10-05-custdata",
         "mandatory":true,
         "meta":{  
            "content_length":99
         }
      }
   ]
}
```

The following example uses a manifest named `cust.manifest`. 

```
copy customer
from 's3://amzn-s3-demo-bucket/cust.manifest' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as orc
manifest;
```

You can use a manifest to load files from different buckets or files that don't share the same prefix. The following example shows the JSON to load data with files whose names begin with a date stamp.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/2013-10-04-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-05-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-06-custdata.txt","mandatory":true},
    {"url":"s3://amzn-s3-demo-bucket/2013-10-07-custdata.txt","mandatory":true}
  ]
}
```

The manifest can list files that are in different buckets, as long as the buckets are in the same AWS Region as the cluster. 

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata1.txt","mandatory":false},
    {"url":"s3://amzn-s3-demo-bucket2/custdata1.txt","mandatory":false},
    {"url":"s3://amzn-s3-demo-bucket2/custdata2.txt","mandatory":false}
  ]
}
```

## Load LISTING from a pipe-delimited file (default delimiter)
<a name="r_COPY_command_examples-load-listing-from-a-pipe-delimited-file-default-delimiter"></a>

The following example is a very simple case in which no options are specified and the input file contains the default delimiter, a pipe character (`|`). 

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

## Load LISTING using columnar data in Parquet format
<a name="r_COPY_command_examples-load-listing-from-parquet"></a>

The following example loads data from a folder on Amazon S3 named parquet. 

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings/parquet/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as parquet;
```

## Load LISTING using columnar data in ORC format
<a name="r_COPY_command_examples-load-listing-from-orc"></a>

The following example loads data from a folder on Amazon S3 named `orc`. 

```
copy listing 
from 's3://amzn-s3-demo-bucket/data/listings/orc/' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as orc;
```

## Load EVENT with options
<a name="r_COPY_command_examples-load-event-with-options"></a>

The following example loads pipe-delimited data into the EVENT table and applies the following rules: 
+ If pairs of quotation marks are used to surround any character strings, they are removed.
+ Both empty strings and strings that contain blanks are loaded as NULL values.
+ The load fails if more than 5 errors are returned.
+ Timestamp values must comply with the specified format; for example, a valid timestamp is `2008-09-26 05:43:12`.

```
copy event
from 's3://amzn-s3-demo-bucket/data/allevents_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
removequotes
emptyasnull
blanksasnull
maxerror 5
delimiter '|'
timeformat 'YYYY-MM-DD HH:MI:SS';
```

## Load VENUE from a fixed-width data file
<a name="r_COPY_command_examples-load-venue-from-a-fixed-width-data-file"></a>

```
copy venue
from 's3://amzn-s3-demo-bucket/data/venue_fw.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth 'venueid:3,venuename:25,venuecity:12,venuestate:2,venueseats:6';
```

The preceding example assumes a data file formatted in the same way as the sample data shown. In the following sample, spaces act as placeholders so that all of the columns have the widths noted in the fixedwidth specification: 

```
1  Toyota Park              Bridgeview  IL0
2  Columbus Crew Stadium    Columbus    OH0
3  RFK Stadium              Washington  DC0
4  CommunityAmerica BallparkKansas City KS0
5  Gillette Stadium         Foxborough  MA68756
```

## Load CATEGORY from a CSV file
<a name="load-from-csv"></a>

Suppose you want to load the CATEGORY table with the values shown in the following table.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html)

The following example shows the contents of a text file with the field values separated by commas.

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,All "non-musical" theatre  
14,Shows,Opera,All opera, light, and "rock" opera
15,Concerts,Classical,All symphony, concerto, and choir concerts
```

If you load the file using the DELIMITER parameter to specify comma-delimited input, the COPY command fails because some input fields contain commas. You can avoid that problem by using the CSV parameter and enclosing the fields that contain commas in quotation mark characters. If the quotation mark character appears within a quoted string, you need to escape it by doubling the quotation mark character. The default quotation mark character is a double quotation mark, so you need to escape each double quotation mark with an additional double quotation mark. Your new input file looks something like this. 

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,"All ""non-musical"" theatre"
14,Shows,Opera,"All opera, light, and ""rock"" opera"
15,Concerts,Classical,"All symphony, concerto, and choir concerts"
```
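This quote-doubling is exactly what standard CSV writers produce automatically. As an illustrative sketch (using Python's standard `csv` module, assuming you prepare the file client-side before uploading it to Amazon S3), the following code emits the same four lines shown above:

```python
import csv
import io

rows = [
    [12, "Shows", "Musicals", "Musical theatre"],
    [13, "Shows", "Plays", 'All "non-musical" theatre'],
    [14, "Shows", "Opera", 'All opera, light, and "rock" opera'],
    [15, "Concerts", "Classical", "All symphony, concerto, and choir concerts"],
]

buf = io.StringIO()
# The default dialect quotes fields containing commas or quotation marks
# and doubles any embedded double quotation marks.
writer = csv.writer(buf, lineterminator="\n")
writer.writerows(rows)
print(buf.getvalue())
```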

Assuming the file name is `category_csv.txt`, you can load the file by using the following COPY command:

```
copy category
from 's3://amzn-s3-demo-bucket/data/category_csv.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
csv;
```

Alternatively, to avoid the need to escape the double quotation marks in your input, you can specify a different quotation mark character by using the QUOTE AS parameter. For example, the following version of `category_csv.txt` uses '`%`' as the quotation mark character.

```
12,Shows,Musicals,Musical theatre
13,Shows,Plays,%All "non-musical" theatre%
14,Shows,Opera,%All opera, light, and "rock" opera%
15,Concerts,Classical,%All symphony, concerto, and choir concerts%
```

The following COPY command uses QUOTE AS to load `category_csv.txt`:

```
copy category
from 's3://amzn-s3-demo-bucket/data/category_csv.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
csv quote as '%';
```

## Load VENUE with explicit values for an IDENTITY column
<a name="r_COPY_command_examples-load-venue-with-explicit-values-for-an-identity-column"></a>

The following example assumes that, when the VENUE table was created, at least one column (such as the `venueid` column) was specified to be an IDENTITY column. This command overrides the default IDENTITY behavior of autogenerating values for an IDENTITY column and instead loads the explicit values from the venue.txt file. Amazon Redshift doesn't check whether duplicate IDENTITY values are loaded into the table when you use the EXPLICIT_IDS option. 

```
copy venue
from 's3://amzn-s3-demo-bucket/data/venue.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
explicit_ids;
```

## Load TIME from a pipe-delimited GZIP file
<a name="r_COPY_command_examples-load-time-from-a-pipe-delimited-gzip-file"></a>

The following example loads the TIME table from a pipe-delimited GZIP file:

```
copy time
from 's3://amzn-s3-demo-bucket/data/timerows.gz' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
gzip
delimiter '|';
```

## Load a timestamp or datestamp
<a name="r_COPY_command_examples-load-a-time-datestamp"></a>

The following example loads data with a formatted timestamp.

**Note**  
The TIMEFORMAT of `HH:MI:SS` can also support fractional seconds beyond the `SS` to a microsecond level of detail. The file `time.txt` used in this example contains one row, `2009-01-12 14:15:57.119568`.

```
copy timestamp1 
from 's3://amzn-s3-demo-bucket/data/time.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
timeformat 'YYYY-MM-DD HH:MI:SS';
```

The result of this copy is as follows: 

```
select * from timestamp1;
c1
----------------------------
2009-01-12 14:15:57.119568
(1 row)
```

## Load data from a file with default values
<a name="r_COPY_command_examples-load-data-from-a-file-with-default-values"></a>

The following example uses a variation of the VENUE table in the TICKIT database. Consider a VENUE_NEW table defined with the following statement: 

```
create table venue_new(
venueid smallint not null,
venuename varchar(100) not null,
venuecity varchar(30),
venuestate char(2),
venueseats integer not null default '1000');
```

Consider a `venue_noseats.txt` data file that contains no values for the VENUESEATS column, as shown in the following example: 

```
1|Toyota Park|Bridgeview|IL|
2|Columbus Crew Stadium|Columbus|OH|
3|RFK Stadium|Washington|DC|
4|CommunityAmerica Ballpark|Kansas City|KS|
5|Gillette Stadium|Foxborough|MA|
6|New York Giants Stadium|East Rutherford|NJ|
7|BMO Field|Toronto|ON|
8|The Home Depot Center|Carson|CA|
9|Dick's Sporting Goods Park|Commerce City|CO|
10|Pizza Hut Park|Frisco|TX|
```

The following COPY statement will successfully load the table from the file and apply the DEFAULT value ('1000') to the omitted column: 

```
copy venue_new(venueid, venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_noseats.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

Now view the loaded table: 

```
select * from venue_new order by venueid;
venueid |         venuename          |    venuecity    | venuestate | venueseats
---------+----------------------------+-----------------+------------+------------
1 | Toyota Park                | Bridgeview      | IL         |       1000
2 | Columbus Crew Stadium      | Columbus        | OH         |       1000
3 | RFK Stadium                | Washington      | DC         |       1000
4 | CommunityAmerica Ballpark  | Kansas City     | KS         |       1000
5 | Gillette Stadium           | Foxborough      | MA         |       1000
6 | New York Giants Stadium    | East Rutherford | NJ         |       1000
7 | BMO Field                  | Toronto         | ON         |       1000
8 | The Home Depot Center      | Carson          | CA         |       1000
9 | Dick's Sporting Goods Park | Commerce City   | CO         |       1000
10 | Pizza Hut Park             | Frisco          | TX         |       1000
(10 rows)
```

For the following example, in addition to assuming that no VENUESEATS data is included in the file, also assume that no VENUENAME data is included: 

```
1||Bridgeview|IL|
2||Columbus|OH|
3||Washington|DC|
4||Kansas City|KS|
5||Foxborough|MA|
6||East Rutherford|NJ|
7||Toronto|ON|
8||Carson|CA|
9||Commerce City|CO|
10||Frisco|TX|
```

 Using the same table definition, the following COPY statement fails because no DEFAULT value was specified for VENUENAME, and VENUENAME is a NOT NULL column: 

```
copy venue(venueid, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

Now consider a variation of the VENUE table that uses an IDENTITY column: 

```
create table venue_identity(
venueid int identity(1,1),
venuename varchar(100) not null,
venuecity varchar(30),
venuestate char(2),
venueseats integer not null default '1000');
```

As with the previous example, assume that the VENUESEATS column has no corresponding values in the source file. The following COPY statement successfully loads the table, including the predefined IDENTITY data values instead of autogenerating those values: 

```
copy venue(venueid, venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|' explicit_ids;
```

This statement fails because it doesn't include the IDENTITY column (VENUEID is missing from the column list) yet includes an EXPLICIT_IDS parameter: 

```
copy venue(venuename, venuecity, venuestate) 
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|' explicit_ids;
```

This statement fails because it doesn't include an EXPLICIT_IDS parameter: 

```
copy venue(venueid, venuename, venuecity, venuestate)
from 's3://amzn-s3-demo-bucket/data/venue_pipe.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter '|';
```

## COPY data with the ESCAPE option
<a name="r_COPY_command_examples-copy-data-with-the-escape-option"></a>

The following example shows how to load characters that match the delimiter character (in this case, the pipe character). In the input file, make sure that all of the pipe characters (`|`) that you want to load are escaped with the backslash character (`\`). Then load the file with the ESCAPE parameter. 

```
$ more redshiftinfo.txt
1|public\|event\|dwuser
2|public\|sales\|dwuser

create table redshiftinfo(infoid int,tableinfo varchar(50));

copy redshiftinfo from 's3://amzn-s3-demo-bucket/data/redshiftinfo.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
delimiter '|' escape;

select * from redshiftinfo order by 1;
infoid |       tableinfo
-------+--------------------
1      | public|event|dwuser
2      | public|sales|dwuser
(2 rows)
```

Without the ESCAPE parameter, this COPY command fails with an `Extra column(s) found` error.
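If you produce such files yourself, you can insert the backslashes when the rows are written. The following Python sketch is an illustrative helper (the rows are taken from the example above); it backslash-escapes any embedded delimiter characters before joining the fields:

```python
def to_escaped_row(fields, delimiter="|"):
    """Join fields with the delimiter, backslash-escaping embedded delimiters."""
    return delimiter.join(str(f).replace(delimiter, "\\" + delimiter) for f in fields)

rows = [(1, "public|event|dwuser"), (2, "public|sales|dwuser")]
for row in rows:
    print(to_escaped_row(row))
# prints:
# 1|public\|event\|dwuser
# 2|public\|sales\|dwuser
```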

**Important**  
If you load your data using a COPY with the ESCAPE parameter, you must also specify the ESCAPE parameter with your UNLOAD command to generate the reciprocal output file. Similarly, if you UNLOAD using the ESCAPE parameter, you need to use ESCAPE when you COPY the same data.

## Copy from JSON examples
<a name="r_COPY_command_examples-copy-from-json"></a>

In the following examples, you load the CATEGORY table with the following data. 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html)

**Topics**
+ [Load from JSON data using the 'auto' option](#copy-from-json-examples-using-auto)
+ [Load from JSON data using the 'auto ignorecase' option](#copy-from-json-examples-using-auto-ignorecase)
+ [Load from JSON data using a JSONPaths file](#copy-from-json-examples-using-jsonpaths)
+ [Load from JSON arrays using a JSONPaths file](#copy-from-json-examples-using-jsonpaths-arrays)

### Load from JSON data using the 'auto' option
<a name="copy-from-json-examples-using-auto"></a>

To load from JSON data using the `'auto'` option, the JSON data must consist of a set of objects. The key names must match the column names, but the order doesn't matter. The following shows the contents of a file named `category_object_auto.json`.

```
{
    "catdesc": "Major League Baseball",
    "catid": 1,
    "catgroup": "Sports",
    "catname": "MLB"
}
{
    "catgroup": "Sports",
    "catid": 2,
    "catname": "NHL",
    "catdesc": "National Hockey League"
}
{
    "catid": 3,
    "catname": "NFL",
    "catgroup": "Sports",
    "catdesc": "National Football League"
}
{
    "bogus": "Bogus Sports LLC",
    "catid": 4,
    "catgroup": "Sports",
    "catname": "NBA",
    "catdesc": "National Basketball Association"
}
{
    "catid": 5,
    "catgroup": "Shows",
    "catname": "Musicals",
    "catdesc": "All symphony, concerto, and choir concerts"
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_auto.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 'auto';
```

### Load from JSON data using the 'auto ignorecase' option
<a name="copy-from-json-examples-using-auto-ignorecase"></a>

To load from JSON data using the `'auto ignorecase'` option, the JSON data must consist of a set of objects. The case of the key names doesn't have to match the column names and the order doesn't matter. The following shows the contents of a file named `category_object_auto-ignorecase.json`.

```
{
    "CatDesc": "Major League Baseball",
    "CatID": 1,
    "CatGroup": "Sports",
    "CatName": "MLB"
}
{
    "CatGroup": "Sports",
    "CatID": 2,
    "CatName": "NHL",
    "CatDesc": "National Hockey League"
}
{
    "CatID": 3,
    "CatName": "NFL",
    "CatGroup": "Sports",
    "CatDesc": "National Football League"
}
{
    "bogus": "Bogus Sports LLC",
    "CatID": 4,
    "CatGroup": "Sports",
    "CatName": "NBA",
    "CatDesc": "National Basketball Association"
}
{
    "CatID": 5,
    "CatGroup": "Shows",
    "CatName": "Musicals",
    "CatDesc": "All symphony, concerto, and choir concerts"
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_auto-ignorecase.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 'auto ignorecase';
```

### Load from JSON data using a JSONPaths file
<a name="copy-from-json-examples-using-jsonpaths"></a>

If the JSON data objects don't correspond directly to column names, you can use a JSONPaths file to map the JSON elements to columns. The order doesn't matter in the JSON source data, but the order of the JSONPaths file expressions must match the column order. Suppose that you have the following data file, named `category_object_paths.json`.

```
{
    "one": 1,
    "two": "Sports",
    "three": "MLB",
    "four": "Major League Baseball"
}
{
    "three": "NHL",
    "four": "National Hockey League",
    "one": 2,
    "two": "Sports"
}
{
    "two": "Sports",
    "three": "NFL",
    "one": 3,
    "four": "National Football League"
}
{
    "one": 4,
    "two": "Sports",
    "three": "NBA",
    "four": "National Basketball Association"
}
{
    "one": 6,
    "two": "Shows",
    "three": "Musicals",
    "four": "All symphony, concerto, and choir concerts"
}
```

The following JSONPaths file, named `category_jsonpath.json`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$['one']",
        "$['two']",
        "$['three']",
        "$['four']"
    ]
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_object_paths.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 's3://amzn-s3-demo-bucket/category_jsonpath.json';
```
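To sanity-check a mapping like this before running COPY, you can preview it client-side. The following Python sketch is illustrative only (COPY evaluates the JSONPaths expressions itself); it resolves simple bracket-style expressions of the form `$['key']` against one source record:

```python
import re

# The JSONPaths file and one source record from the example above
jsonpaths = {"jsonpaths": ["$['one']", "$['two']", "$['three']", "$['four']"]}
record = {"three": "NHL", "four": "National Hockey League", "one": 2, "two": "Sports"}

def apply_path(expr, obj):
    """Resolve a simple $['key'] bracket expression against a JSON object."""
    key = re.fullmatch(r"\$\['(.+)'\]", expr).group(1)
    return obj[key]

# Column order follows the JSONPaths file, not the key order in the record.
row = [apply_path(p, record) for p in jsonpaths["jsonpaths"]]
print(row)  # → [2, 'Sports', 'NHL', 'National Hockey League']
```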

### Load from JSON arrays using a JSONPaths file
<a name="copy-from-json-examples-using-jsonpaths-arrays"></a>

To load from JSON data that consists of a set of arrays, you must use a JSONPaths file to map the array elements to columns. Suppose that you have the following data file, named `category_array_data.json`.

```
[1,"Sports","MLB","Major League Baseball"]
[2,"Sports","NHL","National Hockey League"]
[3,"Sports","NFL","National Football League"]
[4,"Sports","NBA","National Basketball Association"]
[5,"Concerts","Classical","All symphony, concerto, and choir concerts"]
```

The following JSONPaths file, named `category_array_jsonpath.json`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$[0]",
        "$[1]",
        "$[2]",
        "$[3]"
    ]
}
```

To load from the JSON data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_array_data.json'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' 
json 's3://amzn-s3-demo-bucket/category_array_jsonpath.json';
```

## Copy from Avro examples
<a name="r_COPY_command_examples-copy-from-avro"></a>

In the following examples, you load the CATEGORY table with the following data. 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html)

**Topics**
+ [Load from Avro data using the 'auto' option](#copy-from-avro-examples-using-auto)
+ [Load from Avro data using the 'auto ignorecase' option](#copy-from-avro-examples-using-auto-ignorecase)
+ [Load from Avro data using a JSONPaths file](#copy-from-avro-examples-using-avropaths)

### Load from Avro data using the 'auto' option
<a name="copy-from-avro-examples-using-auto"></a>

To load from Avro data using the `'auto'` argument, field names in the Avro schema must match the column names. When using the `'auto'` argument, order doesn't matter. The following shows the schema for a file named `category_auto.avro`.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "catid", "type": "int"},
        {"name": "catdesc", "type": "string"},
        {"name": "catname", "type": "string"},
        {"name": "catgroup", "type": "string"}
    ]
}
```

The data in an Avro file is in binary format, so it isn't human-readable. The following shows a JSON representation of the data in the `category_auto.avro` file. 

```
{
   "catid": 1,
   "catdesc": "Major League Baseball",
   "catname": "MLB",
   "catgroup": "Sports"
}
{
   "catid": 2,
   "catdesc": "National Hockey League",
   "catname": "NHL",
   "catgroup": "Sports"
}
{
   "catid": 3,
   "catdesc": "National Basketball Association",
   "catname": "NBA",
   "catgroup": "Sports"
}
{
   "catid": 4,
   "catdesc": "All symphony, concerto, and choir concerts",
   "catname": "Classical",
   "catgroup": "Concerts"
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_auto.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as avro 'auto';
```

### Load from Avro data using the 'auto ignorecase' option
<a name="copy-from-avro-examples-using-auto-ignorecase"></a>

To load from Avro data using the `'auto ignorecase'` argument, the case of the field names in the Avro schema does not have to match the case of column names. When using the `'auto ignorecase'` argument, order doesn't matter. The following shows the schema for a file named `category_auto-ignorecase.avro`.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "CatID", "type": "int"},
        {"name": "CatDesc", "type": "string"},
        {"name": "CatName", "type": "string"},
        {"name": "CatGroup", "type": "string"}
    ]
}
```

The data in an Avro file is in binary format, so it isn't human-readable. The following shows a JSON representation of the data in the `category_auto-ignorecase.avro` file. 

```
{
   "CatID": 1,
   "CatDesc": "Major League Baseball",
   "CatName": "MLB",
   "CatGroup": "Sports"
}
{
   "CatID": 2,
   "CatDesc": "National Hockey League",
   "CatName": "NHL",
   "CatGroup": "Sports"
}
{
   "CatID": 3,
   "CatDesc": "National Basketball Association",
   "CatName": "NBA",
   "CatGroup": "Sports"
}
{
   "CatID": 4,
   "CatDesc": "All symphony, concerto, and choir concerts",
   "CatName": "Classical",
   "CatGroup": "Concerts"
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_auto-ignorecase.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format as avro 'auto ignorecase';
```

### Load from Avro data using a JSONPaths file
<a name="copy-from-avro-examples-using-avropaths"></a>

If the field names in the Avro schema don't correspond directly to column names, you can use a JSONPaths file to map the schema elements to columns. The order of the JSONPaths file expressions must match the column order. 

Suppose that you have a data file named `category_paths.avro` that contains the same data as in the previous example, but with the following schema.

```
{
    "name": "category",
    "type": "record",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "desc", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "group", "type": "string"},
        {"name": "region", "type": "string"} 
     ]
}
```

The following JSONPaths file, named `category_path.avropath`, maps the source data to the table columns.

```
{
    "jsonpaths": [
        "$['id']",
        "$['group']",
        "$['name']",
        "$['desc']"
    ]
}
```

To load from the Avro data file in the previous example, run the following COPY command.

```
copy category
from 's3://amzn-s3-demo-bucket/category_paths.avro'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
format avro 's3://amzn-s3-demo-bucket/category_path.avropath';
```

## Preparing files for COPY with the ESCAPE option
<a name="r_COPY_preparing_data"></a>

The following example describes how you might prepare data to "escape" newline characters before importing the data into an Amazon Redshift table using the COPY command with the ESCAPE parameter. Without preparing the data to delimit the newline characters, Amazon Redshift returns load errors when you run the COPY command, because the newline character is normally used as a record separator. 

For example, consider a file or a column in an external table that you want to copy into an Amazon Redshift table. If the file or column contains XML-formatted content or similar data, you need to make sure that all of the newline characters (`\n`) that are part of the content are escaped with the backslash character (`\`). 

A file or table containing embedded newline characters provides a relatively easy pattern to match. Each embedded newline character typically follows a `>` character, with potentially some white space characters (`' '` or tab) in between, as you can see in the following example of a text file named `nlTest1.txt`. 

```
$ cat nlTest1.txt
<xml start>
<newline characters provide>
<line breaks at the end of each>
<line in content>
</xml>|1000
<xml>
</xml>|2000
```

In the following example, you can run a text-processing utility to preprocess the source file and insert escape characters where needed. (The `|` character is intended to be used as the delimiter to separate column data when copied into an Amazon Redshift table.) 

```
$ sed -e ':a;N;$!ba;s/>[[:space:]]*\n/>\\\n/g' nlTest1.txt > nlTest2.txt
```

Alternatively, you can use Perl to perform the same operation: 

```
cat nlTest1.txt | perl -p -e 's/>\s*\n/>\\\n/g' > nlTest2.txt
```
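If neither `sed` nor Perl is available, the same preprocessing can be sketched in Python. The following is an illustrative equivalent of the commands above, shown on an inline sample rather than the `nlTest1.txt` file:

```python
import re

def escape_embedded_newlines(text):
    """Backslash-escape each newline that follows a '>' (optionally across whitespace)."""
    return re.sub(r">\s*\n", ">\\\\\n", text)

sample = "<line in content>\n</xml>|1000\n"
print(escape_embedded_newlines(sample))
```

To mirror the `sed` command exactly, you would read `nlTest1.txt`, pass its contents through this function, and write the result to `nlTest2.txt`.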

To accommodate loading the data from the `nlTest2.txt` file into Amazon Redshift, we created a two-column table in Amazon Redshift. The first column, c1, is a character column that holds XML-formatted content from the `nlTest2.txt` file. The second column, c2, holds integer values loaded from the same file. 

After running the `sed` command, you can correctly load data from the `nlTest2.txt` file into an Amazon Redshift table using the ESCAPE parameter. 

**Note**  
When you include the ESCAPE parameter with the COPY command, it escapes a number of special characters, including the backslash character and newline. 

```
copy t2 from 's3://amzn-s3-demo-bucket/data/nlTest2.txt' 
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'  
escape
delimiter as '|';

select * from t2 order by 2;

c1           |  c2
-------------+------
<xml start>
<newline characters provide>
<line breaks at the end of each>
<line in content>
</xml>
| 1000
<xml>
</xml>       | 2000
(2 rows)
```

You can prepare data files exported from external databases in a similar way. For example, with an Oracle database, you can use the REPLACE function on each affected column in a table that you want to copy into Amazon Redshift. 

```
SELECT c1, REPLACE(c2, '\n', '\\n') as c2 from my_table_with_xml
```

In addition, many database export and extract, transform, load (ETL) tools that routinely process large amounts of data provide options to specify escape and delimiter characters. 

## Loading a shapefile into Amazon Redshift
<a name="copy-example-spatial-copy-shapefile"></a>

The following examples demonstrate how to load an Esri shapefile using COPY. For more information about loading shapefiles, see [Loading a shapefile into Amazon Redshift](spatial-copy-shapefile.md). 

### Loading a shapefile
<a name="copy-example-spatial-copy-shapefile-loading-copy"></a>

The following steps show how to ingest OpenStreetMap data from Amazon S3 using the COPY command. This example assumes that the Norway shapefile archive from [the download site of Geofabrik](https://download.geofabrik.de/europe.html) has been uploaded to a private Amazon S3 bucket in your AWS Region. The `.shp`, `.shx`, and `.dbf` files must share the same Amazon S3 prefix and file name.

#### Ingesting data without simplification
<a name="spatial-copy-shapefile-loading-copy-fits"></a>

The following commands create tables and ingest data that can fit in the maximum geometry size without any simplification. Open the `gis_osm_natural_free_1.shp` file in your preferred GIS software and inspect the columns in this layer. By default, either IDENTITY or GEOMETRY columns are first. When a GEOMETRY column is first, you can create the table as shown following.

```
CREATE TABLE norway_natural (
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Or, when an IDENTITY column is first, you can create the table as shown following.

```
CREATE TABLE norway_natural_with_id (
   fid INT IDENTITY(1,1),
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Now you can ingest the data using COPY.

```
COPY norway_natural FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO: Load into table 'norway_natural' completed, 83891 record(s) loaded successfully
```

Or you can ingest the data as shown following. 

```
COPY norway_natural_with_id FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO: Load into table 'norway_natural_with_id' completed, 83891 record(s) loaded successfully.
```

#### Ingesting data with simplification
<a name="spatial-copy-shapefile-loading-copy-no-fit"></a>

The following commands create a table and try to ingest data that can't fit in the maximum geometry size without any simplification. Inspect the `gis_osm_water_a_free_1.shp` shapefile and create the appropriate table as shown following.

```
CREATE TABLE norway_water (
   wkb_geometry GEOMETRY,
   osm_id BIGINT,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

When the COPY command runs, it results in an error.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
ERROR:  Load into table 'norway_water' failed.  Check 'stl_load_errors' system table for details.
```

Querying `STL_LOAD_ERRORS` shows that the geometry is too large. 

```
SELECT line_number, btrim(colname), btrim(err_reason) FROM stl_load_errors WHERE query = pg_last_copy_id();
 line_number |    btrim     |                                 btrim
-------------+--------------+-----------------------------------------------------------------------
     1184705 | wkb_geometry | Geometry size: 1513736 is larger than maximum supported size: 1048447
```

To overcome this, the `SIMPLIFY AUTO` parameter is added to the COPY command to simplify geometries.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
SIMPLIFY AUTO
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';

INFO:  Load into table 'norway_water' completed, 1989196 record(s) loaded successfully.
```

To view the rows and geometries that were simplified, query `SVL_SPATIAL_SIMPLIFY`.

```
SELECT * FROM svl_spatial_simplify WHERE query = pg_last_copy_id();
 query | line_number | maximum_tolerance | initial_size | simplified | final_size |   final_tolerance
-------+-------------+-------------------+--------------+------------+------------+----------------------
    20 |     1184704 |                -1 |      1513736 | t          |    1008808 |   1.276386653895e-05
    20 |     1664115 |                -1 |      1233456 | t          |    1023584 | 6.11707814796635e-06
```

Using SIMPLIFY AUTO *max_tolerance* with a tolerance lower than the automatically calculated ones is likely to result in an ingestion error. In this case, use MAXERROR to ignore errors.

```
COPY norway_water FROM 's3://bucket_name/shapefiles/norway/gis_osm_water_a_free_1.shp'
FORMAT SHAPEFILE
SIMPLIFY AUTO 1.1E-05
MAXERROR 2
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';

INFO:  Load into table 'norway_water' completed, 1989195 record(s) loaded successfully.
INFO:  Load into table 'norway_water' completed, 1 record(s) could not be loaded.  Check 'stl_load_errors' system table for details.
```

Query `SVL_SPATIAL_SIMPLIFY` again to identify the record that COPY didn't manage to load.

```
SELECT * FROM svl_spatial_simplify WHERE query = pg_last_copy_id();
 query | line_number | maximum_tolerance | initial_size | simplified | final_size | final_tolerance
-------+-------------+-------------------+--------------+------------+------------+-----------------
    29 |     1184704 |           1.1e-05 |      1513736 | f          |          0 |               0
    29 |     1664115 |           1.1e-05 |      1233456 | t          |     794432 |         1.1e-05
```

In this example, the first record couldn't fit within the given tolerance, so the `simplified` column shows false. The second record was loaded within the given tolerance. However, its final size is larger than when using the automatically calculated tolerance without specifying a maximum tolerance. 

### Loading from a compressed shapefile
<a name="copy-example-spatial-copy-shapefile-compressed"></a>

Amazon Redshift COPY supports ingesting data from a compressed shapefile. All shapefile components must have the same Amazon S3 prefix and the same compression suffix. As an example, suppose that you want to load the data from the previous example. In this case, the files `gis_osm_water_a_free_1.shp.gz`, `gis_osm_water_a_free_1.dbf.gz`, and `gis_osm_water_a_free_1.shx.gz` must share the same Amazon S3 directory. The COPY command requires the GZIP option, and the FROM clause must specify the correct compressed file, as shown following.

```
COPY norway_natural FROM 's3://bucket_name/shapefiles/norway/compressed/gis_osm_natural_free_1.shp.gz'
FORMAT SHAPEFILE
GZIP
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO:  Load into table 'norway_natural' completed, 83891 record(s) loaded successfully.
```

### Loading data into a table with a different column order
<a name="copy-example-spatial-copy-shapefile-column-order"></a>

If you have a table that doesn't have `GEOMETRY` as the first column, you can use column mapping to map columns to the target table. For example, create a table with `osm_id` specified as a first column.

```
CREATE TABLE norway_natural_order (
   osm_id BIGINT,
   wkb_geometry GEOMETRY,
   code INT,
   fclass VARCHAR,
   name VARCHAR);
```

Then ingest a shapefile using column mapping.

```
COPY norway_natural_order(wkb_geometry, osm_id, code, fclass, name) 
FROM 's3://bucket_name/shapefiles/norway/gis_osm_natural_free_1.shp'
FORMAT SHAPEFILE
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/MyRoleName';
INFO:  Load into table 'norway_natural_order' completed, 83891 record(s) loaded successfully.
```

### Loading data into a table with a geography column
<a name="copy-example-spatial-copy-shapefile-geography"></a>

If you have a table that has a `GEOGRAPHY` column, you first ingest into a `GEOMETRY` column and then cast the objects to `GEOGRAPHY` objects. For example, after you copy your shapefile into a `GEOMETRY` column, alter the table to add a column of the `GEOGRAPHY` data type.

```
ALTER TABLE norway_natural ADD COLUMN wkb_geography GEOGRAPHY;
```

Then convert geometries to geographies.

```
UPDATE norway_natural SET wkb_geography = wkb_geometry::geography;
```

Optionally, you can drop the `GEOMETRY` column.

```
ALTER TABLE norway_natural DROP COLUMN wkb_geometry;
```
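To spot-check the conversion, you can inspect a converted value and confirm that no rows were missed. The following is a sketch; it assumes the `ST_AsEWKT` spatial function is available in your cluster and that you haven't yet dropped the original column.

```
-- Inspect one converted value (EWKT shows the SRID and coordinates).
SELECT ST_AsEWKT(wkb_geography) FROM norway_natural LIMIT 1;

-- Any rows left NULL indicate geometries that were not converted.
SELECT COUNT(*) FROM norway_natural WHERE wkb_geography IS NULL;
```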

## COPY command with the NOLOAD option
<a name="r_COPY_command_examples-load-noload-option"></a>

To validate data files before you actually load the data, use the NOLOAD option with the COPY command. Amazon Redshift parses the input file and displays any errors that occur. The following example uses the NOLOAD option and no rows are actually loaded into the table.

```
COPY public.zipcode1
FROM 's3://amzn-s3-demo-bucket/mydata/zipcode.csv' 
DELIMITER ';' 
IGNOREHEADER 1 REGION 'us-east-1'
NOLOAD
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/myRedshiftRole';

Warnings:
Load into table 'zipcode1' completed, 0 record(s) loaded successfully.
```

## COPY command with a multibyte delimiter and the ENCODING option
<a name="r_COPY_command_examples-load-encoding-multibyte-delimiter-option"></a>

The following example loads data into the `latin1` table from an Amazon S3 file that contains multibyte data. The COPY command specifies the delimiter in octal form, `\302\246\303\254`, to separate the fields in the input file, which is encoded as ISO-8859-1. To specify the same delimiter in UTF-8, specify `DELIMITER '¦ì'`.

```
COPY latin1
FROM 's3://amzn-s3-demo-bucket/multibyte/myfile' 
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
DELIMITER '\302\246\303\254'
ENCODING ISO88591
```

# CREATE DATABASE
<a name="r_CREATE_DATABASE"></a>

Creates a new database.

To create a database, you must be a superuser or have the CREATEDB privilege. To create a database associated with a zero-ETL integration, you must be a superuser or have both CREATEDB and CREATEUSER privileges.

You can't run CREATE DATABASE within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

## Syntax
<a name="r_CREATE_DATABASE-synopsis"></a>

```
CREATE DATABASE database_name 
[ { [ 
      FROM INTEGRATION '<integration_id>' [ DATABASE '<source_database>' ]
      [ SET ]
      [ ACCEPTINVCHARS [=] { TRUE | FALSE }]
      [ QUERY_ALL_STATES [=] { TRUE | FALSE }] 
      [ REFRESH_INTERVAL <interval> ] 
      [ TRUNCATECOLUMNS [=] { TRUE | FALSE } ]
      [ HISTORY_MODE [=] {TRUE | FALSE} ]
    ]
    [ WITH ]
    [ OWNER [=] db_owner ]
    [ CONNECTION LIMIT { limit | UNLIMITED } ]
    [ COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI } ]
    [ ISOLATION LEVEL { SNAPSHOT | SERIALIZABLE } ]
  }
  | { FROM { { ARN '<arn>' } { WITH DATA CATALOG SCHEMA '<schema>' | WITH NO DATA CATALOG SCHEMA } } }
  | { IAM_ROLE  {default | 'SESSION' | 'arn:aws:iam::<account-id>:role/<role-name>' } }
  | { [ WITH PERMISSIONS ] FROM DATASHARE datashare_name OF [ ACCOUNT account_id ] NAMESPACE namespace_guid }
]
```

## Parameters
<a name="r_CREATE_DATABASE-parameters"></a>

 *database_name*   
Name of the new database. For more information about valid names, see [Names and identifiers](r_names.md).

FROM INTEGRATION '<integration_id>' [ DATABASE '<source_database>' ]   
Specifies whether to create the database using a zero-ETL integration identifier. You can retrieve the `integration_id` from the SVV_INTEGRATION system view. For Aurora PostgreSQL zero-ETL integrations, you also need to specify the `source_database` name, which you can also retrieve from SVV_INTEGRATION.  
For an example, see [Create databases to receive results of zero-ETL integrations](#r_CREATE_DATABASE-integration). For more information about creating databases with zero-ETL integrations, see [Creating destination databases in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-using.creating-db.html) in the *Amazon Redshift Management Guide*.

SET  
Optional keyword.

ACCEPTINVCHARS [=] { TRUE | FALSE }  
The ACCEPTINVCHARS clause sets whether zero-ETL integration tables continue with ingestion when invalid characters are detected for the VARCHAR data type. When invalid characters are encountered, the invalid character is replaced with a default `?` character.

QUERY_ALL_STATES [=] { TRUE | FALSE }  
The QUERY_ALL_STATES clause sets whether zero-ETL integration tables can be queried in all states (`Synced`, `Failed`, `ResyncRequired`, and `ResyncInitiated`). By default, a zero-ETL integration table can only be queried in the `Synced` state.
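For example, a command like the following (a sketch with a placeholder integration ID) creates an integration database whose tables remain queryable in all states:

```
CREATE DATABASE sample_integration_db
FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111'
SET QUERY_ALL_STATES = TRUE;
```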

REFRESH_INTERVAL <interval>  
The REFRESH_INTERVAL clause sets the approximate time interval, in seconds, at which data is refreshed from the zero-ETL source to the target database. For zero-ETL integrations whose source type is Aurora MySQL, Aurora PostgreSQL, or RDS for MySQL, the value can be 0–432,000 seconds (up to 5 days), and the default `interval` is zero (0) seconds. For Amazon DynamoDB zero-ETL integrations, the value can be 900–432,000 seconds (15 minutes to 5 days), and the default `interval` is 900 seconds (15 minutes).

TRUNCATECOLUMNS [=] { TRUE | FALSE }  
The TRUNCATECOLUMNS clause sets whether zero-ETL integration tables continue with ingestion when the values for VARCHAR columns or SUPER column attributes exceed the limit. When `TRUE`, the values are truncated to fit into the column, and the values of overflowing JSON attributes are truncated to fit into the SUPER column.
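The ingestion-tolerance options can be combined. The following sketch (placeholder integration ID; the comma-separated SET list is an assumption based on the SET examples elsewhere in this reference) replaces invalid characters and truncates oversized values instead of failing the integration:

```
CREATE DATABASE sample_integration_db
FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111'
SET ACCEPTINVCHARS = TRUE, TRUNCATECOLUMNS = TRUE;
```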

HISTORY_MODE [=] { TRUE | FALSE }  
A clause that specifies whether Amazon Redshift will set history mode for all new tables in the specified database. This option is only applicable for databases created for zero-ETL integration.  
The HISTORY_MODE clause can be set to `TRUE` or `FALSE`. The default is `FALSE`. For information about HISTORY_MODE, see [History mode](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl-history-mode.html) in the *Amazon Redshift Management Guide*.

WITH  
Optional keyword.

OWNER [=] db_owner  
Specifies the username of the database owner.

CONNECTION LIMIT { *limit* | UNLIMITED }   
The maximum number of database connections users are permitted to have open concurrently. The limit isn't enforced for superusers. Use the UNLIMITED keyword to permit the maximum number of concurrent connections. A limit on the number of connections for each user might also apply. For more information, see [CREATE USER](r_CREATE_USER.md). The default is UNLIMITED. To view current connections, query the [STV_SESSIONS](r_STV_SESSIONS.md) system view.  
If both user and database connection limits apply, an unused connection slot must be available that is within both limits when a user attempts to connect.
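For example, the following sketch (placeholder database and owner names) creates a database that permits at most 50 concurrent connections from non-superusers:

```
CREATE DATABASE reports_db WITH OWNER dwuser CONNECTION LIMIT 50;
```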

COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI }  
A clause that specifies whether string search or comparison is case sensitive or case insensitive. The default is case sensitive.  
COLLATE is not supported when you create a database from a datashare.  
CASE_SENSITIVE and CS are interchangeable and yield the same results. Similarly, CASE_INSENSITIVE and CI are interchangeable and yield the same results.

ISOLATION LEVEL { SNAPSHOT | SERIALIZABLE }  
A clause that specifies the isolation level used when queries run against a database. For more information on isolation levels, see [Isolation levels in Amazon Redshift](c_serial_isolation.md).  
+ SNAPSHOT isolation – Provides an isolation level with protection against update and delete conflicts. This is the default for a database created in a provisioned cluster or serverless namespace. 
+ SERIALIZABLE isolation – Provides full serializability for concurrent transactions. 

FROM ARN '<ARN>'  
The AWS Glue database ARN to use to create the database.

{ WITH DATA CATALOG SCHEMA '<schema>' | WITH NO DATA CATALOG SCHEMA }  
This parameter is only applicable if your CREATE DATABASE command also uses the FROM ARN parameter.  
Specifies whether to create the database using a schema to help access objects in the AWS Glue Data Catalog.

IAM_ROLE { default | 'SESSION' | 'arn:aws:iam::*<AWS account-id>*:role/*<role-name>*' }   
This parameter is only applicable if your CREATE DATABASE command also uses the FROM ARN parameter.
If you specify an IAM role that is associated with the cluster when running the CREATE DATABASE command, Amazon Redshift will use the role’s credentials when you run queries on the database.  
Specifying the `default` keyword means to use the IAM role that's set as the default and associated with the cluster.  
Use `'SESSION'` if you connect to your Amazon Redshift cluster using a federated identity and access the tables from the external schema created using this command. For an example of using a federated identity, see [Using a federated identity to manage Amazon Redshift access to local resources and Amazon Redshift Spectrum external tables](https://docs.aws.amazon.com/redshift/latest/mgmt/authorization-fas-spectrum.html), which explains how to configure federated identity.   
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. As a minimum, the IAM role must have permission to perform a LIST operation on the Amazon S3 bucket to be accessed and a GET operation on the Amazon S3 objects the bucket contains. To learn more about using IAM_ROLE when creating a database using AWS Glue Data Catalog for datashares, see [Working with Lake Formation-managed datashares as a consumer](https://docs.aws.amazon.com/redshift/latest/dg/lake-formation-getting-started-consumer.html).  
The following shows the syntax for the IAM_ROLE parameter string for a single ARN.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-name>'
```
You can chain roles so that your cluster can assume another IAM role, possibly belonging to another account. You can chain up to 10 roles. For more information, see [Chaining IAM roles in Amazon Redshift Spectrum](c-spectrum-iam-policies.md#c-spectrum-chaining-roles).   
 To this IAM role, attach an IAM permissions policy similar to the following.    

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AccessSecret",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:us-west-2:123456789012:secret:my-rds-secret-VNenFy"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetRandomPassword",
                "secretsmanager:ListSecrets"
            ],
            "Resource": "*"
        }
    ]
}
```
For the steps to create an IAM role to use with federated query, see [Creating a secret and an IAM role to use federated queries](federated-create-secret-iam-role.md).   
Don't include spaces in the list of chained roles.
The following shows the syntax for chaining three roles.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-1-name>,arn:aws:iam::<aws-account-id>:role/<role-2-name>,arn:aws:iam::<aws-account-id>:role/<role-3-name>'
```

## Syntax for using CREATE DATABASE with a datashare
<a name="r_CREATE_DATABASE-datashare-synopsis"></a>

The following syntax describes the CREATE DATABASE command used to create databases from a datashare for sharing data within the same AWS account.

```
CREATE DATABASE database_name
[ WITH PERMISSIONS ] FROM DATASHARE datashare_name OF [ ACCOUNT account_id ] NAMESPACE namespace_guid
```

The following syntax describes the CREATE DATABASE command used to create databases from a datashare for sharing data across AWS accounts.

```
CREATE DATABASE database_name
[ WITH PERMISSIONS ] FROM DATASHARE datashare_name OF ACCOUNT account_id NAMESPACE namespace_guid
```

### Parameters for using CREATE DATABASE with a datashare
<a name="r_CREATE_DATABASE-parameters-datashare"></a>

FROM DATASHARE   
A keyword that indicates where the datashare is located.

 *datashare_name*   
The name of the datashare that the consumer database is created on.

WITH PERMISSIONS  
Specifies that the database created from the datashare requires object-level permissions to access individual database objects. Without this clause, users or roles granted the USAGE permission on the database will automatically have access to all database objects in the database.
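For example, the following sketch creates a consumer database where the USAGE permission alone doesn't grant access to shared objects, so object-level grants are still required (the datashare name and namespace GUID are placeholders):

```
CREATE DATABASE sales_db WITH PERMISSIONS
FROM DATASHARE salesshare OF NAMESPACE '13b8833d-17c6-4f16-8fe4-1a018f5ed00d';
```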

 NAMESPACE *namespace_guid*   
A value that specifies the producer namespace that the datashare belongs to.

ACCOUNT *account_id*  
A value that specifies the producer account that the datashare belongs to.

## Usage notes for CREATE DATABASE for data sharing
<a name="r_CREATE_DATABASE-usage"></a>

As a database superuser, when you use CREATE DATABASE to create databases from datashares within the AWS account, specify the NAMESPACE option. The ACCOUNT option is optional. When you use CREATE DATABASE to create databases from datashares across AWS accounts, specify both the ACCOUNT and NAMESPACE from the producer.

You can create only one consumer database for one datashare on a consumer cluster. You can't create multiple consumer databases referring to the same datashare.

## CREATE DATABASE from AWS Glue Data Catalog
<a name="r_CREATE_DATABASE_data-catalog"></a>

To create a database using an AWS Glue database ARN, specify the ARN in your CREATE DATABASE command.

```
CREATE DATABASE sampledb FROM ARN <glue-database-arn> WITH NO DATA CATALOG SCHEMA;
```

Optionally, you can also supply a value for the IAM_ROLE parameter. For more information about the parameter and accepted values, see [Parameters](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_DATABASE.html#r_CREATE_DATABASE-parameters).

The following are examples that demonstrate how to create a database from an ARN using an IAM role.

```
CREATE DATABASE sampledb FROM ARN <glue-database-arn> WITH NO DATA CATALOG SCHEMA IAM_ROLE <iam-role-arn>;
```

```
CREATE DATABASE sampledb FROM ARN <glue-database-arn> WITH NO DATA CATALOG SCHEMA IAM_ROLE default;
```

You can also create a database using a DATA CATALOG SCHEMA.

```
CREATE DATABASE sampledb FROM ARN <glue-database-arn> WITH DATA CATALOG SCHEMA <sample_schema> IAM_ROLE default;
```

## Create databases to receive results of zero-ETL integrations
<a name="r_CREATE_DATABASE-integration"></a>

To create a database using a zero-ETL integration identifier, specify the `integration_id` in your CREATE DATABASE command.

```
CREATE DATABASE destination_db_name FROM INTEGRATION 'integration_id';
```

For example, first retrieve the integration IDs from SVV_INTEGRATION.

```
SELECT integration_id FROM SVV_INTEGRATION;
```

Then use one of the retrieved integration IDs to create the database that receives zero-ETL integrations.

```
CREATE DATABASE sampledb FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111';
```

When the zero-ETL integration's source database is required, specify it as shown in the following example.

```
CREATE DATABASE sampledb FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111' DATABASE sourcedb;
```

You can also set a refresh interval for the database. For example, to set the refresh interval to 7,200 seconds for data from a zero-ETL integration source:

```
CREATE DATABASE myacct_mysql FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111' SET REFRESH_INTERVAL 7200;
```

Query the SVV_INTEGRATION catalog view for information about a zero-ETL integration, such as its integration_id, target_database, source, refresh_interval, and more.

```
SELECT * FROM svv_integration;
```

The following example creates a database from an integration with history mode on.

```
CREATE DATABASE sample_integration_db FROM INTEGRATION 'a1b2c3d4-5678-90ab-cdef-EXAMPLE11111' SET HISTORY_MODE = true;
```

## CREATE DATABASE limits
<a name="r_CREATE_DATABASE-create-database-limits"></a>

Amazon Redshift enforces these limits for databases:
+ Maximum of 60 user-defined databases per cluster.
+ Maximum of 127 bytes for a database name.
+ A database name can't be a reserved word. 

## Database collation
<a name="r_CREATE_DATABASE-collation"></a>

Collation is a set of rules that defines how the database engine compares and sorts character-type data in SQL. Case-insensitive collation is the most commonly used collation. Amazon Redshift uses case-insensitive collation to facilitate migration from other data warehouse systems. With native support for case-insensitive collation, Amazon Redshift continues to use important tuning or optimization methods, such as distribution keys, sort keys, and range-restricted scans. 

The COLLATE clause specifies the default collation for all CHAR and VARCHAR columns in the database. If CASE_INSENSITIVE is specified, all CHAR or VARCHAR columns use case-insensitive collation. For information about collation, see [Collation sequences](c_collation_sequences.md).

Data inserted or ingested in case-insensitive columns keeps its original case. But all comparison-based string operations, including sorting and grouping, are case-insensitive. Pattern-matching operations, such as LIKE predicates, SIMILAR TO, and regular expression functions, are also case-insensitive.

The following SQL operations support applicable collation semantics:
+ Comparison operators: =, <>, <, <=, >, >=.
+ LIKE operator
+ ORDER BY clauses
+ GROUP BY clauses
+ Aggregate functions that use string comparison, such as MIN, MAX, and LISTAGG
+ Window functions, such as PARTITION BY clauses and ORDER BY clauses
+ Scalar functions greatest() and least(), STRPOS(), REGEXP_COUNT(), REGEXP_REPLACE(), REGEXP_INSTR(), REGEXP_SUBSTR()
+ DISTINCT clause
+ UNION, INTERSECT, and EXCEPT
+ IN LIST
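For example, in a database created with CASE_INSENSITIVE collation, a LIKE predicate matches regardless of case. The following sketch assumes a hypothetical table `t` in such a database:

```
CREATE TABLE t (name VARCHAR(20));
INSERT INTO t VALUES ('Alice'), ('ALICE'), ('bob');

-- With case-insensitive collation, both Alice rows match this pattern.
SELECT name FROM t WHERE name LIKE 'ali%';
```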

For external queries, including Amazon Redshift Spectrum and Aurora PostgreSQL federated queries, the collation of a VARCHAR or CHAR column is the same as the current database-level collation.

The following example queries an Amazon Redshift Spectrum table:

```
SELECT ci_varchar FROM spectrum.test_collation
WHERE ci_varchar = 'AMAZON';

ci_varchar
----------
amazon
Amazon
AMAZON
AmaZon
(4 rows)
```

For information on how to create tables using database collation, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).

For information on the COLLATE function, see [COLLATE function](r_COLLATE.md).

### Database collation limitations
<a name="r_CREATE_DATABASE-collation-limitations"></a>

The following are limitations when working with database collation in Amazon Redshift:
+ All system tables and views, including PG catalog tables and Amazon Redshift system tables, are case-sensitive.
+ When the consumer database and producer database have different database-level collations, Amazon Redshift doesn't support cross-database and cross-cluster queries.
+ Amazon Redshift doesn't support case-insensitive collation in leader node-only queries.

  The following example shows an unsupported case-insensitive query and the error that Amazon Redshift sends:

  ```
  SELECT collate(usename, 'case_insensitive') FROM pg_user;
  ERROR:  Case insensitive collation is not supported in leader node only query.
  ```
+ Amazon Redshift doesn't support interaction between case-sensitive and case-insensitive columns, such as comparison, function, join, or set operations.

  The following examples show errors when case-sensitive and case-insensitive columns interact:

  ```
  CREATE TABLE test
    (ci_col varchar(10) COLLATE case_insensitive,
     cs_col varchar(10) COLLATE case_sensitive,
     cint int,
     cbigint bigint);
  ```

  ```
  SELECT ci_col = cs_col FROM test;
  ERROR:  Query with different collations is not supported yet.
  ```

  ```
  SELECT concat(ci_col, cs_col) FROM test;
  ERROR:  Query with different collations is not supported yet.
  ```

  ```
  SELECT ci_col FROM test UNION SELECT cs_col FROM test;
  ERROR:  Query with different collations is not supported yet.
  ```

  ```
  SELECT * FROM test a, test b WHERE a.ci_col = b.cs_col;
  ERROR:  Query with different collations is not supported yet.
  ```

  ```
  Select Coalesce(ci_col, cs_col) from test;
  ERROR:  Query with different collations is not supported yet.
  ```

  ```
  Select case when cint > 0 then ci_col else cs_col end from test;
  ERROR:  Query with different collations is not supported yet.
  ```

To make these queries work, use the COLLATE function to convert collation of one column to match the other. For more information, see [COLLATE function](r_COLLATE.md).
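For example, converting one side of the comparison with the COLLATE function makes the first failing query above legal (a sketch against the `test` table defined earlier):

```
-- Convert cs_col to case-insensitive collation so both columns compare.
SELECT ci_col = collate(cs_col, 'case_insensitive') FROM test;
```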

## Examples
<a name="r_CREATE_DATABASE-examples"></a>

**Creating a database**  
The following example creates a database named TICKIT and gives ownership to the user DWUSER.

```
create database tickit
with owner dwuser;
```

To view details about databases, query the PG_DATABASE_INFO catalog table. 

```
select datname, datdba, datconnlimit
from pg_database_info
where datdba > 1;

 datname     | datdba | datconnlimit
-------------+--------+-------------
 admin       |    100 | UNLIMITED
 reports     |    100 | 100
 tickit      |    100 | 100
```

The following example creates a database named **sampledb** with SNAPSHOT isolation level.

```
CREATE DATABASE sampledb ISOLATION LEVEL SNAPSHOT;
```

The following example creates the database sales_db from the datashare salesshare.

```
CREATE DATABASE sales_db FROM DATASHARE salesshare OF NAMESPACE '13b8833d-17c6-4f16-8fe4-1a018f5ed00d';
```

### Database collation examples
<a name="r_CREATE_DATABASE-collation-examples"></a>

**Creating a case-insensitive database**  
The following example creates the `sampledb` database, creates the `T1` table, and inserts data into the `T1` table.

```
create database sampledb collate case_insensitive;
```

Connect to the new database using your SQL client. When using Amazon Redshift query editor v2, choose `sampledb` in the **Editor**. When using RSQL, use a command like the following.

```
\connect sampledb;
```

```
CREATE TABLE T1 (
  col1 Varchar(20) distkey sortkey
);
```

```
INSERT INTO T1 VALUES ('bob'), ('john'), ('Mary'), ('JOHN'), ('Bob');
```

Then the following query matches `John` regardless of case.

```
SELECT * FROM T1 WHERE col1 = 'John';

 col1
 ------
 john
 JOHN
(2 rows)
```

**Ordering in a case-insensitive order**  
The following example shows case-insensitive ordering with table T1. The ordering of *Bob* and *bob* or *John* and *john* is nondeterministic because they are equal in a case-insensitive column.

```
SELECT * FROM T1 ORDER BY 1;

 col1
 ------
 bob
 Bob
 JOHN
 john
 Mary
(5 rows)
```

Similarly, the following example shows case-insensitive ordering with the GROUP BY clause. *Bob* and *bob* are equal and belong to the same group. It is nondeterministic which one shows up in the result.

```
SELECT col1, count(*) FROM T1 GROUP BY 1;

 col1 | count
 -----+------
 Mary |  1
 bob  |  2
 JOHN |  2
(3 rows)
```

**Querying with a window function on case-insensitive columns**  
The following example queries a window function on a case-insensitive column.

```
SELECT col1, rank() over (ORDER BY col1) FROM T1;

 col1 | rank
 -----+------
 bob  |   1
 Bob  |   1
 john |   3
 JOHN |   3
 Mary |   5
(5 rows)
```

**Querying with the DISTINCT keyword**  
The following example queries the `T1` table with the DISTINCT keyword.

```
SELECT DISTINCT col1 FROM T1;

 col1
 ------
 bob
 Mary
 john
(3 rows)
```

**Querying with the UNION clause**  
The following example shows the results from the UNION of the tables `T1` and `T2`.

```
CREATE TABLE T2 AS SELECT * FROM T1;
```

```
SELECT col1 FROM T1 UNION SELECT col1 FROM T2;

 col1
 ------
 john
 bob
 Mary
(3 rows)
```

# CREATE DATASHARE
<a name="r_CREATE_DATASHARE"></a>

Creates a new datashare in the current database. The owner of this datashare is the issuer of the CREATE DATASHARE command.

Amazon Redshift associates each datashare with a single Amazon Redshift database. You can only add objects from the associated database to a datashare. You can create multiple datashares on the same Amazon Redshift database.

For information about datashares, see [Data sharing in Amazon Redshift](datashare-overview.md).

To view information about the datashares, use [SHOW DATASHARES](r_SHOW_DATASHARES.md).

## Required privileges
<a name="r_CREATE_DATASHARE-privileges"></a>

Following are required privileges for CREATE DATASHARE:
+ Superuser
+ Users with the CREATE DATASHARE privilege
+ Database owner

## Syntax
<a name="r_CREATE_DATASHARE-synopsis"></a>

```
CREATE DATASHARE datashare_name
[[SET] PUBLICACCESSIBLE [=] TRUE | FALSE ];
```

## Parameters
<a name="r_CREATE_DATASHARE-parameters"></a>

*datashare_name*  
The name of the datashare. The datashare name must be unique in the cluster namespace.

[[SET] PUBLICACCESSIBLE]  
A clause that specifies whether the datashare can be shared to clusters that are publicly accessible.  
The default value for `SET PUBLICACCESSIBLE` is `FALSE`.

## Usage notes
<a name="r_CREATE_DATASHARE_usage"></a>

By default, the owner of the datashare owns only the share itself, not the objects within it.

Only superusers and the database owner can use CREATE DATASHARE and delegate ALTER privileges to other users or groups. 

## Examples
<a name="r_CREATE_DATASHARE_examples"></a>

The following example creates the datashare `salesshare`.

```
CREATE DATASHARE salesshare;
```

The following example creates the datashare `demoshare` that AWS Data Exchange manages.

```
CREATE DATASHARE demoshare SET PUBLICACCESSIBLE TRUE, MANAGEDBY ADX;
```

# CREATE EXTERNAL FUNCTION
<a name="r_CREATE_EXTERNAL_FUNCTION"></a>

Creates a scalar user-defined function (UDF) based on AWS Lambda for Amazon Redshift. For more information about Lambda user-defined functions, see [Scalar Lambda UDFs](udf-creating-a-lambda-sql-udf.md).

## Required privileges
<a name="r_CREATE_EXTERNAL_FUNCTION-privileges"></a>

Following are required privileges for CREATE EXTERNAL FUNCTION:
+ Superuser
+ Users with the CREATE [ OR REPLACE ] EXTERNAL FUNCTION privilege

## Syntax
<a name="r_CREATE_EXTERNAL_FUNCTION-synopsis"></a>

```
CREATE [ OR REPLACE ] EXTERNAL FUNCTION external_fn_name ( [data_type] [, ...] )
RETURNS data_type
{ VOLATILE | STABLE }
LAMBDA 'lambda_fn_name'
IAM_ROLE { default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' }
RETRY_TIMEOUT milliseconds
MAX_BATCH_ROWS count
MAX_BATCH_SIZE size [ KB | MB ];
```

## Parameters
<a name="r_CREATE_EXTERNAL_FUNCTION-parameters"></a>

OR REPLACE  
A clause that specifies that if a function with the same name and input argument data types, or *signature*, as this one already exists, the existing function is replaced. You can only replace a function with a new function that defines an identical set of data types. You must be a superuser to replace a function.  
If you define a function with the same name as an existing function but a different signature, you create a new function. In other words, the function name is overloaded. For more information, see [Overloading function names](udf-naming-udfs.md#udf-naming-overloading-function-names).

*external_fn_name*  
The name of the external function. If you specify a schema name (such as myschema.myfunction), the function is created using the specified schema. Otherwise, the function is created in the current schema. For more information about valid names, see [Names and identifiers](r_names.md).   
We recommend that you prefix all UDF names with `f_`. Amazon Redshift reserves the `f_` prefix for UDF names. By using the `f_` prefix, you help ensure that your UDF name won't conflict with any built-in SQL function names for Amazon Redshift now or in the future. For more information, see [Preventing UDF naming conflicts](udf-naming-udfs.md).

*data\_type*  
The data type for the input arguments. For more information, see [Scalar Python UDFs](udf-creating-a-scalar-udf.md) and [Scalar Lambda UDFs](udf-creating-a-lambda-sql-udf.md).

RETURNS *data\_type*  
The data type of the value returned by the function. The RETURNS data type can be any standard Amazon Redshift data type. For more information, see [Scalar Python UDFs](udf-creating-a-scalar-udf.md) and [Scalar Lambda UDFs](udf-creating-a-lambda-sql-udf.md).

VOLATILE \| STABLE  
Informs the query optimizer about the volatility of the function.   
To get the best optimization, label your function with the strictest volatility category that is valid for it. In order of strictness, beginning with the least strict, the volatility categories are as follows:  
+ VOLATILE
+ STABLE
VOLATILE  
Given the same arguments, the function can return different results on successive calls, even for the rows in a single statement. The query optimizer cannot make assumptions about the behavior of a volatile function. A query that uses a volatile function must reevaluate the function for every input.  
STABLE  
Given the same arguments, the function is guaranteed to return the same results on successive calls processed within a single statement. The function can return different results when called in different statements. This category allows the optimizer to reduce the number of times the function is called within a single statement.  
Note that if the chosen strictness is not valid for the function, the optimizer might skip some calls based on that strictness. This can result in an incorrect result set.  
The IMMUTABLE clause isn't currently supported for Lambda UDFs.

LAMBDA *'lambda\_fn\_name'*  
The name of the AWS Lambda function that Amazon Redshift calls.  
For steps to create an AWS Lambda function, see [Create a Lambda function with the console](https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html) in the *AWS Lambda Developer Guide*.  
For information regarding permissions required for the Lambda function, see [AWS Lambda permissions](https://docs.aws.amazon.com/lambda/latest/dg/lambda-permissions.html) in the *AWS Lambda Developer Guide*.

IAM\_ROLE { default \| 'arn:aws:iam::*<AWS account-id>*:role/*<role-name>*' }  
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE EXTERNAL FUNCTION command runs.  
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. The CREATE EXTERNAL FUNCTION command is authorized to invoke Lambda functions through this IAM role. If your cluster has an existing IAM role with permissions to invoke Lambda functions attached, you can substitute your role's ARN. For more information, see [Configuring the authorization parameter for Lambda UDFs](udf-creating-a-lambda-sql-udf.md#udf-lambda-authorization).  
The following shows the syntax for the IAM\_ROLE parameter.  

```
IAM_ROLE 'arn:aws:iam::aws-account-id:role/role-name'
```

RETRY\_TIMEOUT *milliseconds*  
The total amount of time, in milliseconds, that Amazon Redshift allows for the delays in retry backoffs.  
Instead of retrying immediately after a failed query, Amazon Redshift backs off and waits for a certain amount of time between retries. Amazon Redshift then retries the failed query until the sum of all the delays is equal to or exceeds the RETRY\_TIMEOUT value that you specified. The default value is 20,000 milliseconds.  
When a Lambda function is invoked, Amazon Redshift retries for queries that receive errors such as `TooManyRequestsException`, `EC2ThrottledException`, and `ServiceException`.   
You can set the RETRY\_TIMEOUT parameter to 0 milliseconds to prevent any retries for a Lambda UDF.
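The retry semantics described above can be modeled as follows. This is an illustrative sketch, not the internal Amazon Redshift implementation; the initial backoff and the exponential growth factor are hypothetical.

```python
def invoke_with_retries(invoke, retry_timeout_ms):
    # Illustrative model of RETRY_TIMEOUT semantics: retry a throttled call
    # with growing backoff delays until the cumulative delay meets or
    # exceeds the budget. With a budget of 0, no retries are made.
    total_delay_ms = 0
    delay_ms = 100            # hypothetical initial backoff
    while True:
        ok, result = invoke()  # invoke() returns (succeeded, result)
        if ok:
            return result
        # Stop once the accumulated delay meets or exceeds the budget.
        if total_delay_ms >= retry_timeout_ms:
            raise RuntimeError("TooManyRequestsException: retry budget exhausted")
        # A real client would sleep for delay_ms here before retrying.
        total_delay_ms += delay_ms
        delay_ms *= 2         # hypothetical exponential backoff
```

Note that setting `retry_timeout_ms` to 0 makes the first throttled call fail immediately, matching the documented behavior of `RETRY_TIMEOUT 0`.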

MAX\_BATCH\_ROWS *count*  
The maximum number of rows that Amazon Redshift sends in a single batch request for a single Lambda invocation.  
This parameter's minimum value is 1. The maximum value is INT\_MAX, or 2,147,483,647.  
This parameter is optional. The default value is INT\_MAX, or 2,147,483,647.

MAX\_BATCH\_SIZE *size* [ KB \| MB ]  
The maximum size of the data payload that Amazon Redshift sends in a single batch request for a single Lambda invocation.  
This parameter's minimum value is 1 KB. The maximum value is 5 MB.  
This parameter's default value is 5 MB.  
KB and MB are optional. If you don't set the unit of measurement, Amazon Redshift defaults to KB.
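Together, MAX\_BATCH\_ROWS and MAX\_BATCH\_SIZE cap how many rows go into each Lambda invocation. The following sketch shows one way rows could be grouped under both limits; it is hypothetical and not the actual Amazon Redshift batching algorithm.

```python
import json

def pack_batches(rows, max_batch_rows, max_batch_size_bytes):
    # Hypothetical sketch: greedily fill a batch until adding another row
    # would exceed either the row-count limit or the payload-size limit,
    # then start a new batch.
    batches, current, current_size = [], [], 0
    for row in rows:
        row_size = len(json.dumps(row))  # approximate serialized size
        if current and (len(current) >= max_batch_rows
                        or current_size + row_size > max_batch_size_bytes):
            batches.append(current)
            current, current_size = [], 0
        current.append(row)
        current_size += row_size
    if current:
        batches.append(current)
    return batches
```

For example, five two-column rows with a row limit of 2 would be split into three batches of sizes 2, 2, and 1.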

## Usage notes
<a name="r_CREATE_FUNCTION-usage-notes"></a>

Consider the following when you create Lambda UDFs: 
+ The order of Lambda function calls on the input arguments isn't fixed or guaranteed. It might vary between instances of running queries, depending on the cluster configuration.
+ The functions are not guaranteed to be applied to each input argument once and only once. The interaction between Amazon Redshift and AWS Lambda might lead to repetitive calls with the same inputs.
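Because rows can be re-sent and reordered, a Lambda UDF handler should be a pure function of its input arguments. The following minimal handler (a hypothetical sketch, not from the original examples) has that property: repeated or reordered invocations with the same rows always produce the same results.

```python
import json

def lambda_handler(event, context=None):
    # A handler that depends only on event['arguments'] returns the same
    # results no matter how often, or in what order, rows are re-sent.
    rows = event['arguments']
    return json.dumps({'results': [sum(row) for row in rows]})
```

Avoid side effects (counters, writes keyed by invocation) in handlers, since the same input rows may arrive more than once.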

## Examples
<a name="r_CREATE_FUNCTION-examples"></a>

Following are examples of using scalar Lambda user-defined functions (UDFs).

### Scalar Lambda UDF example using a Node.js Lambda function
<a name="r_CREATE_FUNCTION-lambda-example-node"></a>

The following example creates an external function called `exfunc_sum` that takes two integers as input arguments. This function returns the sum as an integer output. The name of the Lambda function to be called is `lambda_sum`. The language used for this Lambda function is Node.js 12.x. Make sure to specify the IAM role. The example uses `'arn:aws:iam::123456789012:role/Redshift-Exfunc-Test'` as the IAM role.

```
CREATE EXTERNAL FUNCTION exfunc_sum(INT,INT)
RETURNS INT
VOLATILE
LAMBDA 'lambda_sum'
IAM_ROLE 'arn:aws:iam::123456789012:role/Redshift-Exfunc-Test';
```

The Lambda function takes in the request payload and iterates over each row. All the values in a single row are added to calculate the sum for that row, which is saved in the response array. The number of rows in the results array must be the same as the number of rows received in the request payload. 

The JSON response payload must have the result data in the `results` field for it to be recognized by the external function. The `arguments` field in the request sent to the Lambda function contains the data payload. There can be multiple rows in the data payload in the case of a batch request. The following Lambda function iterates over all the rows in the request data payload. It also individually iterates over all the values within a single row.

```
exports.handler = async (event) => {
    // The 'arguments' field in the request sent to the Lambda function contains the data payload.
    var t1 = event['arguments'];

    // 'len(t1)' represents the number of rows in the request payload.
    // The number of results in the response payload should be the same as the number of rows received.
    const resp = new Array(t1.length);

    // Iterating over all the rows in the request payload.
    for (const [i, x] of t1.entries())
    {
        var sum = 0;
        // Iterating over all the values in a single row.
        for (const y of x) {
            sum = sum + y;
        }
        resp[i] = sum;
    }
    // The 'results' field should contain the results of the lambda call.
    const response = {
        results: resp
    };
    return JSON.stringify(response);
};
```

The following example calls the external function with literal values.

```
select exfunc_sum(1,2);
exfunc_sum
------------
 3
(1 row)
```

The following example creates a table called t\_sum with two columns, c1 and c2, of the integer data type and inserts two rows of data. Then the external function is called by passing the column names of this table. The two table rows are sent as a batch request in the request payload as a single Lambda invocation.

```
CREATE TABLE t_sum(c1 int, c2 int);
INSERT INTO t_sum VALUES (4,5), (6,7);
SELECT exfunc_sum(c1,c2) FROM t_sum;
 exfunc_sum
---------------
 9
 13
(2 rows)
```

### Scalar Lambda UDF example using the RETRY\_TIMEOUT attribute
<a name="r_CREATE_FUNCTION-lambda-example-retry"></a>

In the following section, you can find an example of how to use the RETRY\_TIMEOUT attribute in Lambda UDFs. 

AWS Lambda functions have concurrency limits that you can set for each function. For more information on concurrency limits, see [Managing concurrency for a Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html) in the *AWS Lambda Developer Guide* and the post [Managing AWS Lambda Function Concurrency](https://aws.amazon.com/blogs/compute/managing-aws-lambda-function-concurrency) on the AWS Compute Blog. 

When the number of requests being served by a Lambda UDF exceeds the concurrency limits, new requests receive the `TooManyRequestsException` error. Amazon Redshift retries on this error until the sum of all the delays between the requests sent to the Lambda function is equal to or exceeds the RETRY\_TIMEOUT value that you set. The default RETRY\_TIMEOUT value is 20,000 milliseconds.

The following example uses a Lambda function named `exfunc_sleep_3`. This function takes in the request payload, iterates over each row, and converts the input to uppercase. It then sleeps for 3 seconds and returns the result. The language used for this Lambda function is Python 3.8. 

The number of rows in the results array must be the same as the number of rows received in the request payload. The JSON response payload must have the result data in the `results` field for it to be recognized by the external function. The `arguments` field in the request sent to the Lambda function contains the data payload. In the case of a batch request, multiple rows can appear in the data payload.

The reserved concurrency for this function is specifically set to 1 to demonstrate the use of the RETRY\_TIMEOUT attribute. With a reserved concurrency of 1, the Lambda function can serve only one request at a time.

```
import json
import time
def lambda_handler(event, context):
    t1 = event['arguments']
    # 'len(t1)' represents the number of rows in the request payload.
    # The number of results in the response payload should be the same as the number of rows received.
    resp = [None]*len(t1)

    # Iterating over all rows in the request payload.
    for i, x in enumerate(t1):
        # Iterating over all the values in a single row.
        for j, y in enumerate(x):
            resp[i] = y.upper()

    time.sleep(3)
    ret = dict()
    ret['results'] = resp
    ret_json = json.dumps(ret)
    return ret_json
```

Following are two additional examples that illustrate the RETRY\_TIMEOUT attribute. Each invokes a single Lambda UDF by running the same SQL query from two concurrent database sessions at the same time. While the first query that invokes the Lambda UDF is being served by the UDF, the second query receives the `TooManyRequestsException` error. This result occurs because you specifically set the reserved concurrency in the UDF to 1. For information on how to set reserved concurrency for Lambda functions, see [Configuring reserved concurrency](https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html#configuration-concurrency-reserved).

The first example sets the RETRY\_TIMEOUT attribute for the Lambda UDF to 0 milliseconds. If the Lambda request receives any exceptions from the Lambda function, Amazon Redshift doesn't make any retries, because the RETRY\_TIMEOUT attribute is set to 0.

```
CREATE OR REPLACE EXTERNAL FUNCTION exfunc_upper(varchar)
RETURNS varchar
VOLATILE
LAMBDA 'exfunc_sleep_3'
IAM_ROLE 'arn:aws:iam::123456789012:role/Redshift-Exfunc-Test'
RETRY_TIMEOUT 0;
```

With the RETRY\_TIMEOUT set to 0, you can run the following two queries from separate database sessions to see different results.

The first SQL query that uses the Lambda UDF runs successfully.

```
select exfunc_upper('Varchar');
 exfunc_upper
 --------------
 VARCHAR
(1 row)
```

The second query, which is run from a separate database session at the same time, receives the `TooManyRequestsException` error.

```
select exfunc_upper('Varchar');
ERROR:  Rate Exceeded.; Exception: TooManyRequestsException; ShouldRetry: 1
DETAIL:
-----------------------------------------------
error:  Rate Exceeded.; Exception: TooManyRequestsException; ShouldRetry: 1
code:      32103
context:query:     0
location:  exfunc_client.cpp:102
process:   padbmaster [pid=26384]
-----------------------------------------------
```

The second example sets the RETRY\_TIMEOUT attribute for the Lambda UDF to 3,000 milliseconds. Even if the second query is run concurrently, the Lambda UDF retries until the total delay reaches 3,000 milliseconds. Thus, both queries run successfully.

```
CREATE OR REPLACE EXTERNAL FUNCTION exfunc_upper(varchar)
RETURNS varchar
VOLATILE
LAMBDA 'exfunc_sleep_3'
IAM_ROLE 'arn:aws:iam::123456789012:role/Redshift-Exfunc-Test'
RETRY_TIMEOUT 3000;
```

With the RETRY\_TIMEOUT set to 3,000 milliseconds, you can run the following two queries from separate database sessions to see the same results.

The first SQL query that runs the Lambda UDF runs successfully.

```
select exfunc_upper('Varchar');
 exfunc_upper
 --------------
 VARCHAR
(1 row)
```

The second query runs concurrently, and the Lambda UDF retries until the total delay is 3,000 milliseconds.

```
select exfunc_upper('Varchar');
 exfunc_upper
--------------
 VARCHAR
(1 row)
```

### Scalar Lambda UDF example using a Python Lambda function
<a name="r_CREATE_FUNCTION-lambda-example-python"></a>

The following example creates an external function named `exfunc_multiplication` that multiplies numbers and returns an integer. This example incorporates the `success` and `error_msg` fields in the Lambda response. The `success` field is set to false when there is an integer overflow in the multiplication result, and the `error_msg` message is set to `Integer multiplication overflow`. The `exfunc_multiplication` function takes three integers as input arguments and returns the product as an integer. 

The name of the Lambda function that is called is `lambda_multiplication`. The language used for this Lambda function is Python 3.8. Make sure to specify the IAM role.

```
CREATE EXTERNAL FUNCTION exfunc_multiplication(int, int, int)
RETURNS INT
VOLATILE
LAMBDA 'lambda_multiplication'
IAM_ROLE 'arn:aws:iam::123456789012:role/Redshift-Exfunc-Test';
```

The Lambda function takes in the request payload and iterates over each row. All the values in a single row are multiplied to calculate the result for that row, which is saved in the response list. This example uses a Boolean success value that is set to true by default. If the multiplication result for a row has an integer overflow, then the success value is set to false. Then the iteration loop breaks. 

While creating the response payload, if the success value is false, the following Lambda function adds the `error_msg` field to the payload and sets the error message to `Integer multiplication overflow`. If the success value is true, then the result data is added to the results field. The number of rows in the results array, if any, must be the same as the number of rows received in the request payload. 

The `arguments` field in the request sent to the Lambda function contains the data payload. There can be multiple rows in the data payload in the case of a batch request. The following Lambda function iterates over all the rows in the request data payload and individually iterates over all the values within a single row. 

```
import json
def lambda_handler(event, context):
    t1 = event['arguments']
    # 'len(t1)' represents the number of rows in the request payload.
    # The number of results in the response payload should be the same as the number of rows received.
    resp = [None]*len(t1)

    # By default success is set to 'True'.
    success = True
    # Iterating over all rows in the request payload.
    for i, x in enumerate(t1):
        mul = 1
        # Iterating over all the values in a single row.
        for j, y in enumerate(x):
            mul = mul*y

        # Check integer overflow.
        if (mul >= 9223372036854775807 or mul <= -9223372036854775808):
            success = False
            break
        else:
            resp[i] = mul
    ret = dict()
    ret['success'] = success
    if not success:
        ret['error_msg'] = "Integer multiplication overflow"
    else:
        ret['results'] = resp
    ret_json = json.dumps(ret)

    return ret_json
```

The following example calls the external function with literal values.

```
SELECT exfunc_multiplication(8, 9, 2);
  exfunc_multiplication
---------------------------
          144
(1 row)
```

The following example creates a table named t\_multi with three columns, c1, c2, and c3, of the integer data type. The external function is called by passing the column names of this table. The data is inserted in such a way as to cause integer overflow, to show how the error is propagated.

```
CREATE TABLE t_multi (c1 int, c2 int, c3 int);
INSERT INTO t_multi VALUES (2147483647, 2147483647, 4);
SELECT exfunc_multiplication(c1, c2, c3) FROM t_multi;
DETAIL:
  -----------------------------------------------
  error:  Integer multiplication overflow
  code:      32004
  context:
  query:     38
  location:  exfunc_data.cpp:276
  process:   query2_16_38 [pid=30494]
  -----------------------------------------------
```

# CREATE EXTERNAL MODEL
<a name="r_create_external_model"></a>

**Topics**
+ [Prerequisites for CREATE EXTERNAL MODEL](#r_create_external_model_prereqs)
+ [Required privileges](#r_simple_create_model-privileges)
+ [Cost control](#r_create_model_cost)
+ [CREATE EXTERNAL MODEL syntax](#r_create_external_model_syntax)
+ [CREATE EXTERNAL MODEL parameters and settings](#r_create_external_model_parameters_settings)
+ [CREATE EXTERNAL MODEL inference function parameters](#r_create_external_model_if_parameters)

## Prerequisites for CREATE EXTERNAL MODEL
<a name="r_create_external_model_prereqs"></a>

Before you use the CREATE EXTERNAL MODEL statement, complete the prerequisites in [Cluster setup for using Amazon Redshift ML](getting-started-machine-learning.md#cluster-setup). The following is a high-level summary of the prerequisites.
+ Create an Amazon Redshift cluster with the AWS Management Console or the AWS Command Line Interface (AWS CLI).
+ Attach the AWS Identity and Access Management (IAM) policy while creating the cluster.
+ To allow Amazon Redshift and Amazon Bedrock to assume the role to interact with other services, add the appropriate trust policy to the IAM role.
+ Enable access to the specific LLMs that you want to use from the Amazon Bedrock console.
+ (Optional) If you encounter throttling exceptions coming from Amazon Bedrock such as `Too many requests, please wait before trying again`, even with small amounts of data, check the quotas under **Service Quotas** in your Amazon Bedrock account. Check that the applied account-level quota is at least the same as the AWS default quota value for the **InvokeModel** requests for the model you are using.

For details for the IAM role, trust policy, and other prerequisites, see [Cluster setup for using Amazon Redshift ML](getting-started-machine-learning.md#cluster-setup).

## Required privileges
<a name="r_simple_create_model-privileges"></a>

Following are required privileges for CREATE EXTERNAL MODEL:
+ Superuser
+ Users with the CREATE MODEL privilege
+ Roles with the GRANT CREATE MODEL privilege

## Cost control
<a name="r_create_model_cost"></a>

 Amazon Redshift ML uses existing cluster resources to create prediction models, so you don’t have to pay additional costs. However, AWS charges for using Amazon Bedrock based on the model you select. For more information, see [Costs for using Amazon Redshift ML](https://docs.aws.amazon.com/redshift/latest/dg/cost.html). 

## CREATE EXTERNAL MODEL syntax
<a name="r_create_external_model_syntax"></a>

The following is the full syntax of the CREATE EXTERNAL MODEL statement.

```
CREATE EXTERNAL MODEL model_name 
FUNCTION function_name
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
MODEL_TYPE BEDROCK
SETTINGS (
   MODEL_ID model_id
   [, PROMPT 'prompt prefix']
   [, SUFFIX 'prompt suffix']
   [, REQUEST_TYPE {RAW|UNIFIED}]
   [, RESPONSE_TYPE {VARCHAR|SUPER}]
);
```

The `CREATE EXTERNAL MODEL` command creates an inference function that you use to generate content. 

The following is the syntax of an inference function that `CREATE EXTERNAL MODEL` creates using a `REQUEST_TYPE` of `RAW`: 

```
SELECT inference_function_name(request_super) 
[FROM table];
```

The following is the syntax of an inference function that `CREATE EXTERNAL MODEL` creates using a `REQUEST_TYPE` of `UNIFIED`: 

```
SELECT inference_function_name(input_text [, inference_config [, additional_model_request_fields]])
[FROM table];
```

For information about how to use the inference function, see [Using an external model for Amazon Redshift ML integration with Amazon Bedrock](machine-learning-br.md#machine-learning-br-use).

## CREATE EXTERNAL MODEL parameters and settings
<a name="r_create_external_model_parameters_settings"></a>

This section describes the parameters and settings for the `CREATE EXTERNAL MODEL` command.

**Topics**
+ [CREATE EXTERNAL MODEL parameters](#r_create_external_model_parameters)
+ [CREATE EXTERNAL MODEL settings](#r_create_external_model_settings)

### CREATE EXTERNAL MODEL parameters
<a name="r_create_external_model_parameters"></a>

model\_name  
The name for the external model. The model name in a schema must be unique.

FUNCTION *function\_name ( data\_type [,...] )*  
The name for the inference function that `CREATE EXTERNAL MODEL` creates. You use the inference function to send requests to Amazon Bedrock and retrieve ML-generated text.

IAM\_ROLE { default \| 'arn:aws:iam::<account-id>:role/<role-name>' }  
The IAM role that Amazon Redshift uses to access Amazon Bedrock. For information about the IAM role, see [Creating or updating an IAM role for Amazon Redshift ML integration with Amazon Bedrock](machine-learning-br.md#machine-learning-br-iam).

MODEL\_TYPE BEDROCK  
Specifies the model type. The only valid value is `BEDROCK`.

SETTINGS ( MODEL\_ID model\_id [,...] )  
Specifies the external model settings. See the section following for details.

### CREATE EXTERNAL MODEL settings
<a name="r_create_external_model_settings"></a>

MODEL\_ID model\_id  
The identifier for the external model, for example, `anthropic.claude-v2`. For information about Amazon Bedrock model IDs, see [Amazon Bedrock model IDs](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html).

PROMPT 'prompt prefix'  
Specifies a static prompt that Amazon Redshift adds to the beginning of every inference request. Only supported with a `REQUEST_TYPE` of `UNIFIED`.

SUFFIX 'prompt suffix'  
Specifies a static prompt that Amazon Redshift adds to the end of every inference request. Only supported with a `REQUEST_TYPE` of `UNIFIED`.

REQUEST\_TYPE { RAW \| UNIFIED }  
Specifies the format of the request sent to Amazon Bedrock. Valid values include the following:  
+ **RAW**: The inference function takes the input as a single SUPER value and always returns a SUPER value. The format of the SUPER value is specific to the Amazon Bedrock model selected. SUPER is the Amazon Redshift data type that stores semistructured data, such as JSON.
+ **UNIFIED**: The inference function uses the unified API, which provides a consistent interface across all Amazon Bedrock models that support messages. This value is the default.

  For more information, see the [Converse API documentation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) in the *Amazon Bedrock API documentation*.

RESPONSE\_TYPE { VARCHAR \| SUPER }  
Specifies the format of the response. If the `REQUEST_TYPE` is `RAW`, the `RESPONSE_TYPE` is required and the only valid value is `SUPER`. For all other `REQUEST_TYPE` values, `RESPONSE_TYPE` is optional and the default value is `VARCHAR`. Valid values include the following:  
+ **VARCHAR**: Amazon Redshift returns only the text response generated by the model.
+ **SUPER**: Amazon Redshift returns the whole response JSON generated by the model as a SUPER value. This includes the text response and information such as the stop reason and the model input and output token usage. SUPER is the Amazon Redshift data type that stores semistructured data, such as JSON. 

## CREATE EXTERNAL MODEL inference function parameters
<a name="r_create_external_model_if_parameters"></a>

This section describes valid parameters for the inference function that the `CREATE EXTERNAL MODEL` command creates. 

### CREATE EXTERNAL MODEL inference function parameters for `REQUEST_TYPE` of `RAW`
<a name="r_create_external_model_if_parameters_raw"></a>

An inference function created with a `REQUEST_TYPE` of `RAW` has one SUPER input argument and always returns a SUPER value. The syntax of the input SUPER value follows the request syntax of the specific model selected from Amazon Bedrock.
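As a sketch of what such an input might look like, the following Python snippet builds a JSON document in the shape of a hypothetical model-native request. The field names here are illustrative only; the actual fields depend entirely on the Amazon Bedrock model you select.

```python
import json

# Hypothetical model-specific request body for a RAW inference function.
# The field names below are illustrative; consult the selected model's
# request schema for the real ones.
raw_request = {
    "prompt": "Summarize this review: great product, fast shipping.",
    "max_tokens_to_sample": 100,
}

# In SQL, a document like this could be passed to the inference function
# as a SUPER value, for example (hypothetical function name):
#   SELECT inference_fn(JSON_PARSE('{"prompt": "...", "max_tokens_to_sample": 100}'));
payload = json.dumps(raw_request)
```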

### CREATE EXTERNAL MODEL inference function parameters for `REQUEST_TYPE` of `UNIFIED`
<a name="r_create_external_model_if_parameters_unified"></a>

input\_text  
The text that Amazon Redshift sends to Amazon Bedrock.

inference\_config  
A SUPER value that contains optional parameters that Amazon Redshift sends to Amazon Bedrock. These can include the following:  
+ maxTokens
+ stopSequences
+ temperature
+ topP
These parameters are all optional and are all case-sensitive. For information about these parameters, see [ InferenceConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InferenceConfiguration.html) in the *Amazon Bedrock API Reference*.
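The parameters above form a small JSON document. The following sketch builds one in Python; the values are illustrative, and the keys match the case-sensitive Amazon Bedrock `InferenceConfiguration` field names.

```python
import json

# Illustrative inference_config document; all four keys are optional and
# case-sensitive. The values shown are example settings, not defaults.
inference_config = {
    "maxTokens": 512,
    "temperature": 0.2,
    "topP": 0.9,
    "stopSequences": ["\n\nHuman:"],
}

# In SQL, a document like this could be supplied as a SUPER value, e.g.:
#   SELECT fn('some input', JSON_PARSE('{"maxTokens": 512, "temperature": 0.2}'));
config_json = json.dumps(inference_config)
```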

# CREATE EXTERNAL SCHEMA
<a name="r_CREATE_EXTERNAL_SCHEMA"></a>

Creates a new external schema in the current database. You can use this external schema to connect to Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-Compatible Edition databases. You can also create an external schema that references a database in an external data catalog such as AWS Glue, Athena, or a database in an Apache Hive metastore, such as Amazon EMR.

The owner of this schema is the issuer of the CREATE EXTERNAL SCHEMA command. To transfer ownership of an external schema, use [ALTER SCHEMA](r_ALTER_SCHEMA.md) to change the owner. To grant access to the schema to other users or user groups, use the [GRANT](r_GRANT.md) command. 

You can't use the GRANT or REVOKE commands for permissions on an external table. Instead, grant or revoke the permissions on the external schema. 

**Note**  
If you currently have Redshift Spectrum external tables in the Amazon Athena data catalog, you can migrate your Athena data catalog to an AWS Glue Data Catalog. To use the AWS Glue Data Catalog with Redshift Spectrum, you might need to change your AWS Identity and Access Management (IAM) policies. For more information, see [Upgrading to the AWS Glue Data Catalog](https://docs.aws.amazon.com/athena/latest/ug/glue-athena.html#glue-upgrade) in the *Athena User Guide*.

To view details for external schemas, query the [SVV\_EXTERNAL\_SCHEMAS](r_SVV_EXTERNAL_SCHEMAS.md) system view. 

## Syntax
<a name="r_CREATE_EXTERNAL_SCHEMA-synopsis"></a>

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference data using an external data catalog. For more information, see [Amazon Redshift Spectrum](c-using-spectrum.md).

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] local_schema_name
FROM [ [ DATA CATALOG ] | HIVE METASTORE | POSTGRES | MYSQL | KINESIS | MSK | REDSHIFT | KAFKA ]
[ DATABASE 'database_name' ]
[ SCHEMA 'schema_name' ]
[ REGION 'aws-region' ]
[ IAM_ROLE [ default | 'SESSION' | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ] ]
[ AUTHENTICATION [ none | iam | mtls] ]
[ AUTHENTICATION_ARN 'acm-certificate-arn' | SECRET_ARN 'ssm-secret-arn' ]
[ URI ['hive_metastore_uri' [ PORT port_number ] | 'hostname' [ PORT port_number ] | 'Kafka bootstrap URL'] ] 
[ CLUSTER_ARN 'arn:aws:kafka:<region>:<AWS account-id>:cluster/msk/<cluster uuid>' ]
[ CATALOG_ROLE [ 'SESSION' | 'catalog-role-arn-string' ] ]
[ CREATE EXTERNAL DATABASE IF NOT EXISTS ]
[ CATALOG_ID 'Amazon Web Services account ID containing Glue or Lake Formation database' ]
```

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference data using a federated query to RDS POSTGRES or Aurora PostgreSQL. You can also create an external schema that references streaming sources, such as Kinesis Data Streams. For more information, see [Querying data with federated queries in Amazon Redshift](federated-overview.md).

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] local_schema_name
FROM POSTGRES
DATABASE 'federated_database_name' [SCHEMA 'schema_name']
URI 'hostname' [ PORT port_number ]
IAM_ROLE [ default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ]
SECRET_ARN 'ssm-secret-arn'
```

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference data using a federated query to RDS MySQL or Aurora MySQL. For more information, see [Querying data with federated queries in Amazon Redshift](federated-overview.md).

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] local_schema_name
FROM MYSQL
DATABASE 'federated_database_name'
URI 'hostname' [ PORT port_number ]
IAM_ROLE [ default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ]
SECRET_ARN 'ssm-secret-arn'
```

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference data in a Kinesis stream. For more information, see [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] schema_name
FROM KINESIS
IAM_ROLE [ default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ]
```

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference the Amazon Managed Streaming for Apache Kafka or Confluent Cloud cluster and its topics to ingest from. To connect, you provide the broker URI. For more information, see [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] schema_name
FROM KAFKA
[ IAM_ROLE [ default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' ] ]
URI 'Kafka bootstrap URI'
AUTHENTICATION [ none | iam | mtls ]
[ AUTHENTICATION_ARN 'acm-certificate-arn' | SECRET_ARN 'ssm-secret-arn' ];
```

The following syntax describes the CREATE EXTERNAL SCHEMA command used to reference data using a cross-database query.

```
CREATE EXTERNAL SCHEMA local_schema_name
FROM  REDSHIFT
DATABASE 'redshift_database_name' SCHEMA 'redshift_schema_name'
```

## Parameters
<a name="r_CREATE_EXTERNAL_SCHEMA-parameters"></a>

IF NOT EXISTS  
A clause that indicates that if the specified schema already exists, the command should make no changes and return a message that the schema exists, rather than terminating with an error. This clause is useful when scripting, so the script doesn't fail if CREATE EXTERNAL SCHEMA tries to create a schema that already exists. 

local\$1schema\$1name  
The name of the new external schema. For more information about valid names, see [Names and identifiers](r_names.md).

FROM [ DATA CATALOG ] \| HIVE METASTORE \| POSTGRES \| MYSQL \| KINESIS \| MSK \| KAFKA \| REDSHIFT  
A keyword that indicates where the external database is located.   
DATA CATALOG indicates that the external database is defined in the Athena data catalog or the AWS Glue Data Catalog.   
If the external database is defined in an external Data Catalog in a different AWS Region, the REGION parameter is required. DATA CATALOG is the default.  
HIVE METASTORE indicates that the external database is defined in an Apache Hive metastore. If HIVE METASTORE is specified, URI is required.   
POSTGRES indicates that the external database is defined in RDS PostgreSQL or Aurora PostgreSQL.  
MYSQL indicates that the external database is defined in RDS MySQL or Aurora MySQL.  
KINESIS indicates that the data source is a stream from Kinesis Data Streams.  
MSK indicates that the data source is an Amazon MSK provisioned or serverless cluster.  
KAFKA indicates that the data source is a Kafka cluster. You can use this keyword for both Amazon MSK and Confluent Cloud.

FROM REDSHIFT  
A keyword that indicates that the database is located in Amazon Redshift.

DATABASE '*redshift\_database\_name*' SCHEMA '*redshift\_schema\_name*'  
The name of the Amazon Redshift database.   
The *redshift\_schema\_name* indicates the schema in Amazon Redshift. The default *redshift\_schema\_name* is `public`.

DATABASE '*federated\_database\_name*'  
A keyword that indicates the name of the external database in a supported PostgreSQL or MySQL database engine. 

[SCHEMA '*schema\_name*']  
The *schema\_name* indicates the schema in a supported PostgreSQL database engine. The default *schema\_name* is `public`.  
You can't specify a SCHEMA when you set up a federated query to a supported MySQL database engine. 

REGION '*aws-region*'  
If the external database is defined in an Athena data catalog or the AWS Glue Data Catalog, the AWS Region in which the database is located. This parameter is required if the database is defined in an external Data Catalog. 

URI [ 'hive\_metastore\_uri' [ PORT port\_number ] \| 'hostname' [ PORT port\_number ] \| 'Kafka bootstrap URI' ]  
The hostname URI and port\_number of a supported PostgreSQL or MySQL database engine. The *hostname* is the head node of the replica set. The endpoint must be reachable (routable) from the Amazon Redshift cluster. The default PostgreSQL port\_number is 5432. The default MySQL port\_number is 3306.  
The supported PostgreSQL or MySQL database engine must be in the same VPC as your Amazon Redshift cluster, with a security group linking Amazon Redshift and RDS PostgreSQL or Aurora PostgreSQL. Additionally, you can use enhanced VPC routing to configure a cross-VPC use case. For more information, see [Redshift-managed VPC endpoints](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-cross-vpc.html).
**Specifying a hive metastore URI**  
If the database is in a Hive metastore, specify the URI and optionally the port number for the metastore. The default port number is 9083.   
A URI doesn't contain a protocol specification ("http://"). An example valid URI: `uri '172.10.10.10'`.   
**Specifying a broker URI for streaming ingestion**  
Including the bootstrap-broker URI provides the ability to connect to an Amazon MSK or Confluent Cloud cluster and receive streamed data. For more information and to see an example, see [Getting started with streaming ingestion from Amazon Managed Streaming for Apache Kafka](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion-getting-started-MSK.html).

IAM\_ROLE [ default \| 'SESSION' \| 'arn:aws:iam::*<AWS account-id>*:role/*<role-name>*' ]  
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE EXTERNAL SCHEMA command runs.  
Use `'SESSION'` if you connect to your Amazon Redshift cluster using a federated identity and access the tables from the external schema created using this command. For more information, see [Using a federated identity to manage Amazon Redshift access to local resources and Amazon Redshift Spectrum external tables](https://docs.aws.amazon.com/redshift/latest/mgmt/authorization-fas-spectrum.html), which explains how to configure federated identity. Note that this configuration, using `'SESSION'` in place of the ARN, can be used only if the schema is created using `DATA CATALOG`.   
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. As a minimum, the IAM role must have permission to perform a LIST operation on the Amazon S3 bucket to be accessed and a GET operation on the Amazon S3 objects the bucket contains.  
The following shows the syntax for the IAM\_ROLE parameter string for a single ARN.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-name>'
```
You can chain roles so that your cluster can assume another IAM role, possibly belonging to another account. You can chain up to 10 roles. For an example of chaining roles, see [Chaining IAM roles in Amazon Redshift Spectrum](c-spectrum-iam-policies.md#c-spectrum-chaining-roles).   
 To this IAM role, attach an IAM permissions policy similar to the following.    

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AccessSecret",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:us-west-2:123456789012:secret:my-rds-secret-VNenFy"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetRandomPassword",
                "secretsmanager:ListSecrets"
            ],
            "Resource": "*"
        }
    ]
}
```
For the steps to create an IAM role to use with federated query, see [Creating a secret and an IAM role to use federated queries](federated-create-secret-iam-role.md).   
Don't include spaces in the list of chained roles.
The following shows the syntax for chaining three roles.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-1-name>,arn:aws:iam::<aws-account-id>:role/<role-2-name>,arn:aws:iam::<aws-account-id>:role/<role-3-name>'
```

SECRET\_ARN '*ssm-secret-arn*'  
The Amazon Resource Name (ARN) of a supported PostgreSQL or MySQL database engine secret created using AWS Secrets Manager. For information about how to create and retrieve an ARN for a secret, see [Manage secrets with AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/manage_create-basic-secret.html) in the *AWS Secrets Manager User Guide*, and [Retrieving the Amazon Resource Name (ARN) of the secret in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-secrets-manager-integration-retrieving-secret.html). 

CATALOG\_ROLE [ 'SESSION' \| *catalog-role-arn-string* ]  
Use `'SESSION'` to connect to your Amazon Redshift cluster using a federated identity for authentication and authorization to the data catalog. For more information about completing the steps for federated identity, see [Using a federated identity to manage Amazon Redshift access to local resources and Amazon Redshift Spectrum external tables](https://docs.aws.amazon.com/redshift/latest/mgmt/authorization-fas-spectrum.html). Note that the `'SESSION'` role can be used only if the schema is created in DATA CATALOG.  
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization for the data catalog.   
If CATALOG\_ROLE isn't specified, Amazon Redshift uses the specified IAM\_ROLE. The catalog role must have permission to access the Data Catalog in AWS Glue or Athena. For more information, see [IAM policies for Amazon Redshift Spectrum](c-spectrum-iam-policies.md).   
The following shows the syntax for the CATALOG\_ROLE parameter string for a single ARN.  

```
CATALOG_ROLE 'arn:aws:iam::<aws-account-id>:role/<catalog-role>'
```
You can chain roles so that your cluster can assume another IAM role, possibly belonging to another account. You can chain up to 10 roles. For more information, see [Chaining IAM roles in Amazon Redshift Spectrum](c-spectrum-iam-policies.md#c-spectrum-chaining-roles).   
The list of chained roles must not include spaces.
The following shows the syntax for chaining three roles.  

```
CATALOG_ROLE 'arn:aws:iam::<aws-account-id>:role/<catalog-role-1-name>,arn:aws:iam::<aws-account-id>:role/<catalog-role-2-name>,arn:aws:iam::<aws-account-id>:role/<catalog-role-3-name>'
```


CREATE EXTERNAL DATABASE IF NOT EXISTS  
A clause that creates an external database with the name specified by the DATABASE argument, if the specified external database doesn't exist. If the specified external database exists, the command makes no changes. In this case, the command returns a message that the external database exists, rather than terminating with an error.  
You can't use CREATE EXTERNAL DATABASE IF NOT EXISTS with HIVE METASTORE.  
To use CREATE EXTERNAL DATABASE IF NOT EXISTS with a Data Catalog enabled for AWS Lake Formation, you need `CREATE_DATABASE` permission on the Data Catalog. 

CATALOG\_ID '*Amazon Web Services account ID containing Glue or Lake Formation database*'  
The AWS account ID where the Data Catalog database is stored.  
`CATALOG_ID` can be specified only if you plan to connect to your Amazon Redshift cluster or to Amazon Redshift Serverless using a federated identity for authentication and authorization to the data catalog by setting either of the following:   
+ `CATALOG_ROLE` to `'SESSION'`
+ `IAM_ROLE` to `'SESSION'` and `CATALOG_ROLE` set to its default 
For more information about completing the steps for federated identity, see [Using a federated identity to manage Amazon Redshift access to local resources and Amazon Redshift Spectrum external tables](https://docs.aws.amazon.com/redshift/latest/mgmt/authorization-fas-spectrum.html). 

AUTHENTICATION  
The authentication type defined for streaming ingestion. Streaming ingestion with authentication types works with Amazon Managed Streaming for Apache Kafka. The `AUTHENTICATION` types are the following:  
+ **none** – Specifies that there is no authentication required. This corresponds to Unauthenticated access on MSK or plaintext with TLS on Apache Kafka.
+ **iam** – Specifies IAM authentication. When you choose this, make sure that the IAM role has permissions for IAM authentication. For more information about defining the external schema, see [Getting started with streaming ingestion from Apache Kafka sources](materialized-view-streaming-ingestion-getting-started-MSK.md).
+ **mtls** – Specifies that mutual transport layer security provides secure communication by facilitating authentication between a client and server. In this case, the client is Redshift and the server is Amazon MSK. For more information about configuring streaming ingestion with mTLS, see [Authentication with mTLS for Redshift streaming ingestion from Apache Kafka sources](materialized-view-streaming-ingestion-mtls.md).
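For example, an mtls schema pairs the broker URI with an ACM certificate ARN. The broker hostname, account ID, and certificate ID below are placeholders; port 9094 is the TLS listener commonly used by Amazon MSK.

```
CREATE EXTERNAL SCHEMA kafka_mtls_schema
FROM KAFKA
IAM_ROLE default
URI 'b-1.mycluster.example.c2.kafka.us-east-1.amazonaws.com:9094'
AUTHENTICATION mtls
AUTHENTICATION_ARN 'arn:aws:acm:us-east-1:123456789012:certificate/certificate-id';
```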


AUTHENTICATION\_ARN  
The ARN of the AWS Certificate Manager certificate used by Amazon Redshift for mtls authentication with Amazon MSK. The ARN is available in the ACM console when you choose the issued certificate.

CLUSTER\_ARN  
For streaming ingestion, CLUSTER\_ARN is the cluster identifier for the Amazon Managed Streaming for Apache Kafka cluster you're streaming from. Using CLUSTER\_ARN requires an IAM role policy that includes the `kafka:GetBootstrapBrokers` permission. This option is provided for backward compatibility. Currently, we recommend using the bootstrap-broker URI option to connect to Amazon Managed Streaming for Apache Kafka clusters. For more information, see [Streaming ingestion](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html).

## Usage notes
<a name="r_CREATE_EXTERNAL_SCHEMA_usage"></a>

For limits when using the Athena data catalog, see [Athena Limits](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#amazon-athena-limits) in the AWS General Reference.

For limits when using the AWS Glue Data Catalog, see [AWS Glue Limits](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_glue) in the AWS General Reference.

These limits don’t apply to a Hive metastore.

There is a maximum of 9,900 schemas per database. For more information, see [Quotas and limits](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.

To unregister the schema, use the [DROP SCHEMA](r_DROP_SCHEMA.md) command. 

To view details for external schemas, query the following system views: 
+ [SVV\_EXTERNAL\_SCHEMAS](r_SVV_EXTERNAL_SCHEMAS.md) 
+ [SVV\_EXTERNAL\_TABLES](r_SVV_EXTERNAL_TABLES.md) 
+ [SVV\_EXTERNAL\_COLUMNS](r_SVV_EXTERNAL_COLUMNS.md) 

## Examples
<a name="r_CREATE_EXTERNAL_SCHEMA_examples"></a>

The following example creates an external schema using a database in a data catalog named `sampledb` in the US West (Oregon) Region. Use this example with an Athena or AWS Glue data catalog.

```
create external schema spectrum_schema
from data catalog
database 'sampledb'
region 'us-west-2'
iam_role 'arn:aws:iam::123456789012:role/MySpectrumRole';
```

The following example creates an external schema and creates a new external database named `spectrum_db`.

```
create external schema spectrum_schema
from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::123456789012:role/MySpectrumRole'
create external database if not exists;
```

The following example creates an external schema using a Hive metastore database named `hive_db`.

```
create external schema hive_schema
from hive metastore
database 'hive_db'
uri '172.10.10.10' port 99
iam_role 'arn:aws:iam::123456789012:role/MySpectrumRole';
```

The following example chains roles to use the role `myS3Role` for accessing Amazon S3 and uses `myAthenaRole` for data catalog access. For more information, see [Chaining IAM roles in Amazon Redshift Spectrum](c-spectrum-iam-policies.md#c-spectrum-chaining-roles).

```
create external schema spectrum_schema
from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::123456789012:role/myRedshiftRole,arn:aws:iam::123456789012:role/myS3Role'
catalog_role 'arn:aws:iam::123456789012:role/myAthenaRole'
create external database if not exists;
```

The following example creates an external schema that references an Aurora PostgreSQL database. 

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] myRedshiftSchema
FROM POSTGRES
DATABASE 'my_aurora_db' SCHEMA 'my_aurora_schema'
URI 'endpoint to aurora hostname' PORT 5432  
IAM_ROLE 'arn:aws:iam::123456789012:role/MyAuroraRole'
SECRET_ARN 'arn:aws:secretsmanager:us-east-2:123456789012:secret:development/MyTestDatabase-AbCdEf'
```

The following example creates an external schema that references the `sales_db` database imported on the consumer cluster.

```
CREATE EXTERNAL SCHEMA sales_schema FROM REDSHIFT DATABASE 'sales_db' SCHEMA 'public';
```
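After the schema is created, queries on the consumer cluster address the shared tables with ordinary two-part names. The table name here is a placeholder for a table that exists in `sales_db`.

```
SELECT count(*) FROM sales_schema.tickit_sales;
```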

The following example creates an external schema that references an Aurora MySQL database. 

```
CREATE EXTERNAL SCHEMA [IF NOT EXISTS] myRedshiftSchema
FROM MYSQL
DATABASE 'my_aurora_db'
URI 'endpoint to aurora hostname'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyAuroraRole'
SECRET_ARN 'arn:aws:secretsmanager:us-east-2:123456789012:secret:development/MyTestDatabase-AbCdEf'
```

# CREATE EXTERNAL TABLE
<a name="r_CREATE_EXTERNAL_TABLE"></a>

Creates a new external table in the specified schema. All external tables must be created in an external schema. Search path isn't supported for external schemas and external tables. For more information, see [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md).

In addition to external tables created using the CREATE EXTERNAL TABLE command, Amazon Redshift can reference external tables defined in an AWS Glue or AWS Lake Formation catalog or an Apache Hive metastore. Use the [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md) command to register an external database defined in the external catalog and make the external tables available for use in Amazon Redshift. If the external table exists in an AWS Glue or AWS Lake Formation catalog or Hive metastore, you don't need to create the table using CREATE EXTERNAL TABLE. To view external tables, query the [SVV\_EXTERNAL\_TABLES](r_SVV_EXTERNAL_TABLES.md) system view. 

By running the CREATE EXTERNAL TABLE AS command, you can create an external table based on the column definition from a query and write the results of that query into Amazon S3. The results are in Apache Parquet or delimited text format. If the external table has a partition key or keys, Amazon Redshift partitions new files according to those partition keys and registers new partitions into the external catalog automatically. For more information about CREATE EXTERNAL TABLE AS, see [Usage notes](r_CREATE_EXTERNAL_TABLE_usage.md). 

You can query an external table using the same SELECT syntax you use with other Amazon Redshift tables. You can also use the INSERT syntax to write new files into the location of the external table on Amazon S3. For more information, see [INSERT (external table)](r_INSERT_external_table.md).

To create a view with an external table, include the WITH NO SCHEMA BINDING clause in the [CREATE VIEW](r_CREATE_VIEW.md) statement.
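For example, a late-binding view over a Redshift Spectrum table might look like the following; the schema and table names are placeholders.

```
CREATE VIEW sales_vw AS
SELECT * FROM spectrum_schema.sales
WITH NO SCHEMA BINDING;
```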

You can't run CREATE EXTERNAL TABLE inside a transaction (BEGIN … END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

## Required privileges
<a name="r_CREATE_EXTERNAL_TABLE-privileges"></a>

To create external tables, you must be the owner of the external schema or a superuser. To transfer ownership of an external schema, use ALTER SCHEMA to change the owner. Access to external tables is controlled by access to the external schema. You can't [GRANT](r_GRANT.md) or [REVOKE](r_REVOKE.md) permissions on an external table. Instead, grant or revoke USAGE on the external schema.

The [Usage notes](r_CREATE_EXTERNAL_TABLE_usage.md) have additional information about specific permissions for external tables.

## Syntax
<a name="r_CREATE_EXTERNAL_TABLE-synopsis"></a>

```
CREATE EXTERNAL TABLE
external_schema.table_name
(column_name data_type [, …] )
[ PARTITIONED BY (col_name data_type [, … ] )]
[ { ROW FORMAT DELIMITED row_format |
  ROW FORMAT SERDE 'serde_name'
  [ WITH SERDEPROPERTIES ( 'property_name' = 'property_value' [, ...] ) ] } ]
STORED AS file_format
LOCATION { 's3://bucket/folder/' | 's3://bucket/manifest_file' }
[ TABLE PROPERTIES ( 'property_name'='property_value' [, ...] ) ]
```

The following is the syntax for CREATE EXTERNAL TABLE AS.

```
CREATE EXTERNAL TABLE
external_schema.table_name
[ PARTITIONED BY (col_name [, … ] ) ]
[ ROW FORMAT DELIMITED row_format ]
STORED AS file_format
LOCATION { 's3://bucket/folder/' }
[ TABLE PROPERTIES ( 'property_name'='property_value' [, ...] ) ]
 AS
 { select_statement }
```

## Parameters
<a name="r_CREATE_EXTERNAL_TABLE-parameters"></a>

 *external\_schema.table\_name*   
The name of the table to be created, qualified by an external schema name. External tables must be created in an external schema. For more information, see [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md).  
The maximum length for the table name is 127 bytes; longer names are truncated to 127 bytes. You can use UTF-8 multibyte characters up to a maximum of four bytes. Amazon Redshift enforces a limit of 9,900 tables per cluster, including user-defined temporary tables and temporary tables created by Amazon Redshift during query processing or system maintenance. Optionally, you can qualify the table name with the database name. In the following example, the database name is `spectrum_db`, the external schema name is `spectrum_schema`, and the table name is `test`.  

```
create external table spectrum_db.spectrum_schema.test (c1 int)
stored as parquet
location 's3://amzn-s3-demo-bucket/myfolder/';
```
If the database or schema specified doesn't exist, the table isn't created, and the statement returns an error. You can't create tables or views in the system databases `template0`, `template1`, `padb_harvest`, or `sys:internal`.  
The table name must be a unique name for the specified schema.   
For more information about valid names, see [Names and identifiers](r_names.md).

( *column\_name* *data\_type* )  
The name and data type of each column being created.  
The maximum length for the column name is 127 bytes; longer names are truncated to 127 bytes. You can use UTF-8 multibyte characters up to a maximum of four bytes. You can't specify column names `"$path"` or `"$size"`. For more information about valid names, see [Names and identifiers](r_names.md).  
By default, Amazon Redshift creates external tables with the pseudocolumns `$path` and `$size`. You can disable creation of pseudocolumns for a session by setting the `spectrum_enable_pseudo_columns` configuration parameter to `false`. For more information, see [Pseudocolumns](r_CREATE_EXTERNAL_TABLE_usage.md#r_CREATE_EXTERNAL_TABLE_usage-pseudocolumns).  
If pseudocolumns are enabled, the maximum number of columns you can define in a single table is 1,598. If pseudocolumns aren't enabled, the maximum number of columns you can define in a single table is 1,600.   
If you are creating a "wide table," make sure that your list of columns doesn't exceed row-width boundaries for intermediate results during loads and query processing. For more information, see [Usage notes](r_CREATE_TABLE_NEW.md#r_CREATE_TABLE_usage).  
For a CREATE EXTERNAL TABLE AS command, a column list is not required, because columns are derived from the query.
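When pseudocolumns are enabled, they can be selected like ordinary columns to see which Amazon S3 object each row came from and how large that object is, as in this query against the `test` table created earlier.

```
SELECT "$path", "$size", c1
FROM spectrum_db.spectrum_schema.test;
```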

 *data\_type*   
The following [Data types](c_Supported_data_types.md) are supported:  
+ SMALLINT (INT2)
+ INTEGER (INT, INT4)
+ BIGINT (INT8)
+ DECIMAL (NUMERIC)
+ REAL (FLOAT4)
+ DOUBLE PRECISION (FLOAT8)
+ BOOLEAN (BOOL)
+ CHAR (CHARACTER)
+ VARCHAR (CHARACTER VARYING)
+ VARBYTE (BINARY VARYING) – can be used with Parquet and ORC data files, and only with non-partitioned tables.
+ DATE – can be used only with text, Parquet, or ORC data files, or as a partition column.
+ TIMESTAMP
  
For DATE, you can use the formats as described following. For month values represented using digits, the following formats are supported:  
+ `mm-dd-yyyy`. For example, `05-01-2017`. This is the default.
+ `yyyy-mm-dd`, where the year is represented by more than 2 digits. For example, `2017-05-01`.
For month values represented using the three letter abbreviation, the following formats are supported:  
+ `mmm-dd-yyyy`. For example, `may-01-2017`. This is the default.
+ `dd-mmm-yyyy`, where the year is represented by more than 2 digits. For example, `01-may-2017`.
+ `yyyy-mmm-dd`, where the year is represented by more than 2 digits. For example, `2017-may-01`.
For year values that are consistently less than 100, the year is calculated in the following manner:  
+ If year is less than 70, the year is calculated as the year plus 2000. For example, the date 05-01-17 in the `mm-dd-yyyy` format is converted into `05-01-2017`.
+ If year is less than 100 and greater than 69, the year is calculated as the year plus 1900. For example, the date 05-01-89 in the `mm-dd-yyyy` format is converted into `05-01-1989`.
+ For year values represented by two digits, add leading zeroes to represent the year in 4 digits.
Timestamp values in text files must be in the format `yyyy-mm-dd HH:mm:ss.SSSSSS`, as the following timestamp value shows: `2017-05-01 11:30:59.000000`.  
The length of a VARCHAR column is defined in bytes, not characters. For example, a VARCHAR(12) column can contain 12 single-byte characters or 6 two-byte characters. When you query an external table, results are truncated to fit the defined column size without returning an error. For more information, see [Storage and ranges](r_Character_types.md#r_Character_types-storage-and-ranges).   
For best performance, we recommend specifying the smallest column size that fits your data. To find the maximum size in bytes for values in a column, use the [OCTET\_LENGTH](r_OCTET_LENGTH.md) function. The following example returns the maximum size of values in the email column.  

```
select max(octet_length(email)) from users;

max
---
 62
```

PARTITIONED BY (*col\_name* *data\_type* [, … ] )  
A clause that defines a partitioned table with one or more partition columns. A separate data directory is used for each specified combination, which can improve query performance in some circumstances. Partitioned columns don't exist within the table data itself. If you use a value for *col\_name* that is the same as a table column, you get an error.   
A clause that defines a partitioned table with one or more partition columns. A separate data directory is used for each specified combination, which can improve query performance in some circumstances. Partitioned columns don't exist within the table data itself. If you use a value for *col\$1name* that is the same as a table column, you get an error.   
After creating a partitioned table, alter the table using an [ALTER TABLE](r_ALTER_TABLE.md) … ADD PARTITION statement to register new partitions to the external catalog. When you add a partition, you define the location of the subfolder on Amazon S3 that contains the partition data.  
For example, if the table `spectrum.lineitem_part` is defined with `PARTITIONED BY (l_shipdate date)`, run the following ALTER TABLE command to add a partition.  

```
ALTER TABLE spectrum.lineitem_part ADD PARTITION (l_shipdate='1992-01-29')
LOCATION 's3://spectrum-public/lineitem_partition/l_shipdate=1992-01-29';
```
If you are using CREATE EXTERNAL TABLE AS, you don't need to run ALTER TABLE...ADD PARTITION. Amazon Redshift automatically registers new partitions in the external catalog. Amazon Redshift also automatically writes corresponding data to partitions in Amazon S3 based on the partition key or keys defined in the table.  
To view partitions, query the [SVV\_EXTERNAL\_PARTITIONS](r_SVV_EXTERNAL_PARTITIONS.md) system view.  
For a CREATE EXTERNAL TABLE AS command, you don't need to specify the data type of the partition column because this column is derived from the query. 
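A minimal CREATE EXTERNAL TABLE AS sketch with a partition key follows; the source table and bucket are placeholders. Note that the partition column appears last in the SELECT list.

```
CREATE EXTERNAL TABLE spectrum.lineitem_monthly
PARTITIONED BY (l_shipdate)
STORED AS PARQUET
LOCATION 's3://amzn-s3-demo-bucket/cetas/lineitem/'
AS SELECT l_orderkey, l_quantity, l_shipdate
FROM local_lineitem;
```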

ROW FORMAT DELIMITED *rowformat*  
A clause that specifies the format of the underlying data. Possible values for *rowformat* are as follows:  
+ LINES TERMINATED BY '*delimiter*' 
+ FIELDS TERMINATED BY '*delimiter*' 
Specify a single ASCII character for '*delimiter*'. You can specify non-printing ASCII characters using octal, in the format `'\`*`ddd`*`'` where *`d`* is an octal digit (0–7) up to '\177'. The following example specifies the BEL (bell) character using octal.   

```
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\007'
```
If ROW FORMAT is omitted, the default format is DELIMITED FIELDS TERMINATED BY '\A' (start of heading) and LINES TERMINATED BY '\n' (newline). 

ROW FORMAT SERDE '*serde\_name*' [WITH SERDEPROPERTIES ( '*property\_name*' = '*property\_value*' [, ...] ) ]  
A clause that specifies the SERDE format for the underlying data.     
'*serde\_name*'  
The name of the SerDe. You can specify the following formats:  
+ org.apache.hadoop.hive.serde2.RegexSerDe 
+ com.amazonaws.glue.serde.GrokSerDe 
+ org.apache.hadoop.hive.serde2.OpenCSVSerde 

  This parameter supports the following SerDe property for OpenCSVSerde: 

  ```
  'wholeFile' = 'true' 
  ```

  Set the `wholeFile` property to `true` to properly parse new line characters (\n) within quoted strings for OpenCSV requests. 
+ org.openx.data.jsonserde.JsonSerDe
  + The JSON SERDE also supports Ion files. 
  + The JSON must be well-formed. 
  + Timestamps in Ion and JSON must use ISO8601 format.
  + This parameter supports the following SerDe property for JsonSerDe: 

    ```
    'strip.outer.array'='true' 
    ```

    Processes Ion/JSON files containing one very large array enclosed in outer brackets ( [ … ] ) as if it contains multiple JSON records within the array. 
+ com.amazon.ionhiveserde.IonHiveSerDe

  The Amazon Ion format provides text and binary formats, in addition to data types. For an external table that references data in Ion format, you map each column in the external table to the corresponding element in the Ion format data. For more information, see [Amazon Ion](https://amzn.github.io/ion-docs/). You also need to specify the input and output formats.  
WITH SERDEPROPERTIES ( '*property\_name*' = '*property\_value*' [, ...] ) ]  
Optionally, specify property names and values, separated by commas.
If ROW FORMAT is omitted, the default format is DELIMITED FIELDS TERMINATED BY '\A' (start of heading) and LINES TERMINATED BY '\n' (newline). 
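Putting the SERDE clauses together, the following sketch defines a table over JSON files where each file holds one large outer array; the column list and bucket path are illustrative.

```
CREATE EXTERNAL TABLE spectrum_schema.cust_json (
  id int,
  name varchar(64))
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ('strip.outer.array'='true')
STORED AS TEXTFILE
LOCATION 's3://amzn-s3-demo-bucket/json/';
```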

STORED AS *file\_format*  
The file format for data files.   
Valid formats are as follows:  
+ PARQUET
+ RCFILE (for data using ColumnarSerDe only, not LazyBinaryColumnarSerDe)
+ SEQUENCEFILE
+ TEXTFILE (for text files, including JSON files).
+ ORC 
+ AVRO 
+ INPUTFORMAT '*input\_format\_classname*' OUTPUTFORMAT '*output\_format\_classname*' 
The CREATE EXTERNAL TABLE AS command only supports two file formats, TEXTFILE and PARQUET.  
For INPUTFORMAT and OUTPUTFORMAT, specify a class name, as the following example shows.   

```
'org.apache.hadoop.mapred.TextInputFormat'
```

LOCATION \{ 's3://*bucket/folder*/' \| 's3://*bucket/manifest\_file*' \}  <a name="create-external-table-location"></a>
The path to the Amazon S3 bucket or folder that contains the data files or a manifest file that contains a list of Amazon S3 object paths. The buckets must be in the same AWS Region as the Amazon Redshift cluster. For a list of supported AWS Regions, see [Amazon Redshift Spectrum limitations](c-spectrum-considerations.md).  
If the path specifies a bucket or folder, for example `'s3://amzn-s3-demo-bucket/custdata/'`, Redshift Spectrum scans the files in the specified bucket or folder and any subfolders. Redshift Spectrum ignores hidden files and files that begin with a period or underscore.   
If the path specifies a manifest file, the `'s3://bucket/manifest_file'` argument must explicitly reference a single file—for example, `'s3://amzn-s3-demo-bucket/manifest.txt'`. It can't reference a key prefix.   
The manifest is a text file in JSON format that lists the URL of each file that is to be loaded from Amazon S3 and the size of the file, in bytes. The URL includes the bucket name and full object path for the file. The files that are specified in the manifest can be in different buckets, but all the buckets must be in the same AWS Region as the Amazon Redshift cluster. If a file is listed twice, the file is loaded twice. The following example shows the JSON for a manifest that loads three files.   

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata.1", "meta": { "content_length": 5956875 } },
    {"url":"s3://amzn-s3-demo-bucket1/custdata.2", "meta": { "content_length": 5997091 } },
    {"url":"s3://amzn-s3-demo-bucket2/custdata.1", "meta": { "content_length": 5978675 } }
  ]
}
```
You can make the inclusion of a particular file mandatory. To do this, include a `mandatory` option at the file level in the manifest. When you query an external table with a mandatory file that is missing, the SELECT statement fails. Ensure that all files included in the definition of the external table are present. If they aren't all present, an error appears showing the first mandatory file that isn't found. The following example shows the JSON for a manifest with the `mandatory` option set to `true`.  

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket1/custdata.1", "mandatory":true, "meta": { "content_length": 5956875 } },
    {"url":"s3://amzn-s3-demo-bucket1/custdata.2", "mandatory":false, "meta": { "content_length": 5997091 } },
    {"url":"s3://amzn-s3-demo-bucket2/custdata.1", "meta": { "content_length": 5978675 } }
  ]
}
```
To reference files created by UNLOAD, you can use the manifest created by [UNLOAD](r_UNLOAD.md) with the MANIFEST parameter. The manifest file is compatible with a manifest file for [COPY from Amazon S3](copy-parameters-data-source-s3.md), but uses different keys. Keys that aren't used are ignored. 
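As an illustration of the manifest layout described above, a manifest can be generated programmatically. The following is a minimal Python sketch (not part of Redshift or the AWS SDK; `build_manifest` is a hypothetical helper) that produces the JSON shown in the preceding examples from a list of S3 URLs and file sizes, marking selected files as mandatory.

```python
import json

def build_manifest(files, mandatory=()):
    """Build a Redshift Spectrum manifest from (url, size_in_bytes) pairs.

    URLs listed in `mandatory` get "mandatory": true, so a query against
    the external table fails if that file is missing from Amazon S3.
    """
    entries = []
    for url, size in files:
        entry = {"url": url, "meta": {"content_length": size}}
        if url in mandatory:
            entry["mandatory"] = True
        entries.append(entry)
    return json.dumps({"entries": entries}, indent=2)

print(build_manifest(
    [("s3://amzn-s3-demo-bucket1/custdata.1", 5956875),
     ("s3://amzn-s3-demo-bucket1/custdata.2", 5997091)],
    mandatory={"s3://amzn-s3-demo-bucket1/custdata.1"},
))
```

Upload the resulting file to Amazon S3 and reference it as the single-file path in the LOCATION clause.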

TABLE PROPERTIES ( '*property_name*'='*property_value*' [, ...] )   
A clause that sets the table definition for table properties.   
Table properties are case-sensitive.  
 'compression_type'='*value*'   
 A property that sets the type of compression to use if the file name doesn't contain an extension. If you set this property and a file has an extension, the extension is ignored and the value set by the property is used. Valid values for compression type are as follows:  
+ bzip2
+ gzip
+ none
+ snappy  
'data_cleansing_enabled'='true / false'  
This property sets whether data handling is on for the table. When 'data_cleansing_enabled' is set to true, data handling is on for the table. When it is set to false, data handling is off for the table. Following is a list of the table-level data handling properties controlled by this property:  
+ column_count_mismatch_handling
+ invalid_char_handling
+ numeric_overflow_handling
+ replacement_char
+ surplus_char_handling
For examples, see [Data handling examples](r_CREATE_EXTERNAL_TABLE_examples.md#r_CREATE_EXTERNAL_TABLE_examples-data-handling).  
'invalid_char_handling'='*value*'   
Specifies the action to perform when query results contain invalid UTF-8 character values. You can specify the following actions:    
DISABLED  
Doesn't perform invalid character handling.  
FAIL  
Cancels queries that return data containing invalid UTF-8 values.  
SET_TO_NULL   
Replaces invalid UTF-8 values with null.  
DROP_ROW  
Replaces each value in the row with null.  
REPLACE  
Replaces the invalid character with the replacement character you specify using `replacement_char`.  
'replacement_char'='*character*'  
Specifies the replacement character to use when you set `invalid_char_handling` to `REPLACE`.  
'numeric_overflow_handling'='*value*'  
Specifies the action to perform when ORC data contains an integer (for example, BIGINT or int64) that is larger than the column definition (for example, SMALLINT or int16). You can specify the following actions:    
DISABLED  
Numeric overflow handling is turned off.  
FAIL  
Cancels the query when the data includes overflowing values.  
SET_TO_NULL  
Sets overflowing values to null.  
DROP_ROW  
Sets each value in the row to null.  
'surplus_bytes_handling'='*value*'  
Specifies how to handle data being loaded that exceeds the length of the data type defined for columns containing VARBYTE data. By default, Redshift Spectrum sets the value to null for data that exceeds the width of the column.  
You can specify the following actions to perform when the query returns data that exceeds the length of the data type:    
SET_TO_NULL  
Replaces data that exceeds the column width with null.  
DISABLED  
Doesn't perform surplus byte handling.  
FAIL  
Cancels queries that return data exceeding the column width.  
DROP_ROW  
Drops all rows that contain data exceeding the column width.  
TRUNCATE  
Removes the data that exceeds the maximum number of bytes defined for the column.  
'surplus_char_handling'='*value*'  
Specifies how to handle data being loaded that exceeds the length of the data type defined for columns containing VARCHAR, CHAR, or string data. By default, Redshift Spectrum sets the value to null for data that exceeds the width of the column.  
You can specify the following actions to perform when the query returns data that exceeds the column width:    
SET_TO_NULL  
Replaces data that exceeds the column width with null.  
DISABLED  
Doesn't perform surplus character handling.  
FAIL  
Cancels queries that return data exceeding the column width.  
DROP_ROW  
Replaces each value in the row with null.  
TRUNCATE  
Removes the characters that exceed the maximum number of characters defined for the column.  
'column_count_mismatch_handling'='*value*'  
Identifies if the file contains fewer or more values for a row than the number of columns specified in the external table definition. This property is only available for an uncompressed text file format. You can specify the following actions:    
DISABLED  
Column count mismatch handling is turned off.  
FAIL  
Fail the query if the column count mismatch is detected.  
SET_TO_NULL  
Fill missing values with NULL and ignore the additional values in each row.  
DROP_ROW  
Drops all rows that contain a column count mismatch error from the scan.  
'numRows'='*row_count*'   
A property that sets the numRows value for the table definition. To explicitly update an external table's statistics, set the numRows property to indicate the size of the table. Amazon Redshift doesn't analyze external tables to generate the table statistics that the query optimizer uses to generate a query plan. If table statistics aren't set for an external table, Amazon Redshift generates a query execution plan based on an assumption that external tables are the larger tables and local tables are the smaller tables.  
'skip.header.line.count'='*line_count*'  
A property that sets the number of rows to skip at the beginning of each source file.  
'serialization.null.format'=' '  
A property that specifies that Redshift Spectrum should return a `NULL` value when there is an exact match with the text supplied in a field.  
'orc.schema.resolution'='*mapping_type*'  
A property that sets the column mapping type for tables that use ORC data format. This property is ignored for other data formats.  
Valid values for column mapping type are as follows:   
+ name 
+ position 
If the *orc.schema.resolution* property is omitted, columns are mapped by name by default. If *orc.schema.resolution* is set to any value other than *'name'* or *'position'*, columns are mapped by position. For more information about column mapping, see [Mapping external table columns to ORC columns](c-spectrum-external-tables.md#c-spectrum-column-mapping-orc).  
The COPY command maps to ORC data files only by position. The *orc.schema.resolution* table property has no effect on COPY command behavior.   
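To make the difference between the two mapping types concrete, the following plain-Python illustration (not Redshift internals) shows how each setting pairs external-table columns with the columns of an ORC file that stores the same columns in a different order.

```python
# External-table column order vs. the order the ORC file stores them in.
table_cols = ["salesid", "listid", "pricepaid"]
orc_cols = ["listid", "salesid", "pricepaid"]

# 'name' (the default): each table column reads the ORC column with the
# same name, wherever it appears in the file.
by_name = {col: orc_cols.index(col) for col in table_cols}

# 'position': the i-th table column reads the i-th ORC column, so here
# salesid would silently read the file's listid values.
by_position = {col: i for i, col in enumerate(table_cols)}

print(by_name)      # {'salesid': 1, 'listid': 0, 'pricepaid': 2}
print(by_position)  # {'salesid': 0, 'listid': 1, 'pricepaid': 2}
```

This is why position mapping is only safe when the ORC files were written with exactly the same column order as the table definition.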
'write.parallel'='on / off'  
A property that sets whether CREATE EXTERNAL TABLE AS should write data in parallel. By default, CREATE EXTERNAL TABLE AS writes data in parallel to multiple files, according to the number of slices in the cluster. The default option is on. When 'write.parallel' is set to off, CREATE EXTERNAL TABLE AS writes to one or more data files serially onto Amazon S3. This table property also applies to any subsequent INSERT statement into the same external table.  
'write.maxfilesize.mb'='*size*'  
A property that sets the maximum size (in MB) of each file written to Amazon S3 by CREATE EXTERNAL TABLE AS. The size must be a valid integer between 5 and 6200. The default maximum file size is 6,200 MB. This table property also applies to any subsequent INSERT statement into the same external table.  
'write.kms.key.id'='*value*'  
You can specify an AWS Key Management Service key to enable server-side encryption (SSE) for Amazon S3 objects, where *value* is one of the following:   
+ `auto` to use the default AWS KMS key stored in the Amazon S3 bucket.
+ *kms-key* that you specify to encrypt data.  
*select_statement*  
A statement that inserts one or more rows into the external table by defining any query. All rows that the query produces are written to Amazon S3 in either text or Parquet format based on the table definition.

## Examples
<a name="r_CREATE_EXTERNAL_TABLE_examples_link"></a>

A collection of examples is available at [Examples](r_CREATE_EXTERNAL_TABLE_examples.md).

# Usage notes
<a name="r_CREATE_EXTERNAL_TABLE_usage"></a>

This topic contains usage notes for [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md). You can't view details for Amazon Redshift Spectrum tables using the same resources that you use for standard Amazon Redshift tables, such as [PG\_TABLE\_DEF](r_PG_TABLE_DEF.md), [STV\_TBL\_PERM](r_STV_TBL_PERM.md), PG\_CLASS, or information\_schema. If your business intelligence or analytics tool doesn't recognize Redshift Spectrum external tables, configure your application to query [SVV\_EXTERNAL\_TABLES](r_SVV_EXTERNAL_TABLES.md) and [SVV\_EXTERNAL\_COLUMNS](r_SVV_EXTERNAL_COLUMNS.md).

## CREATE EXTERNAL TABLE AS
<a name="r_CETAS"></a>

In some cases, you might run the CREATE EXTERNAL TABLE AS command on an AWS Glue Data Catalog, AWS Lake Formation external catalog, or Apache Hive metastore. In such cases, you use an AWS Identity and Access Management (IAM) role to create the external schema. This IAM role must have both read and write permissions on Amazon S3. 

If you use a Lake Formation catalog, the IAM role must have the permission to create table in the catalog. In this case, it must also have the data lake location permission on the target Amazon S3 path. This IAM role becomes the owner of the new AWS Lake Formation table.

To ensure that file names are unique, Amazon Redshift uses the following format for the name of each file uploaded to Amazon S3 by default.

`<date>_<time>_<microseconds>_<query_id>_<slice-number>_part_<part-number>.<format>`

 An example is `20200303_004509_810669_1007_0001_part_00.parquet`.
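For readers who need to work with these output objects, the name format above can be split into its components. The following is a short Python sketch under that assumption; `parse_output_name` and `NAME_PATTERN` are hypothetical helpers, not part of Redshift or the AWS SDK.

```python
import re

# Matches <date>_<time>_<microseconds>_<query_id>_<slice-number>_part_<part-number>.<format>
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<time>\d{6})_(?P<microseconds>\d+)_"
    r"(?P<query_id>\d+)_(?P<slice_number>\d+)_part_(?P<part_number>\d+)"
    r"\.(?P<format>\w+)$"
)

def parse_output_name(name):
    """Return the named components of a CETAS output file name, or None."""
    match = NAME_PATTERN.match(name)
    return match.groupdict() if match else None

parts = parse_output_name("20200303_004509_810669_1007_0001_part_00.parquet")
print(parts["query_id"], parts["slice_number"], parts["format"])
# 1007 0001 parquet
```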

Consider the following when running the CREATE EXTERNAL TABLE AS command:
+ The Amazon S3 location must be empty.
+ Amazon Redshift only supports PARQUET and TEXTFILE formats when using the STORED AS clause.
+ You don't need to define a column definition list. Column names and column data types of the new external table are derived directly from the SELECT query.
+ You don't need to define the data type of the partition column in the PARTITIONED BY clause. If you specify a partition key, the name of this column must exist in the SELECT query result. When you have multiple partition columns, their order in the SELECT query doesn't matter. Amazon Redshift uses the order defined in the PARTITIONED BY clause to create the external table.
+ Amazon Redshift automatically partitions output files into partition folders based on the partition key values. By default, Amazon Redshift removes partition columns from the output files.
+ The LINES TERMINATED BY 'delimiter' clause isn't supported.
+ The ROW FORMAT SERDE '*serde_name*' clause isn't supported.
+ The use of manifest files isn't supported. Thus, you can't define the LOCATION clause to a manifest file on Amazon S3.
+ Amazon Redshift automatically updates the 'numRows' table property at the end of the command.
+ The 'compression_type' table property only accepts 'none' or 'snappy' for the PARQUET file format.
+ Amazon Redshift doesn't allow the LIMIT clause in the outer SELECT query. Instead, you can use a nested LIMIT clause.
+ You can use STL\_UNLOAD\_LOG to track the files that are written to Amazon S3 by each CREATE EXTERNAL TABLE AS operation.

## Permissions to create and query external tables
<a name="r_CREATE_EXTERNAL_TABLE_usage-permissions"></a>

To create external tables, make sure that you're the owner of the external schema or a superuser. To transfer ownership of an external schema, use [ALTER SCHEMA](r_ALTER_SCHEMA.md). The following example changes the owner of the `spectrum_schema` schema to `newowner`.

```
alter schema spectrum_schema owner to newowner;
```

To run a Redshift Spectrum query, you need the following permissions:
+ Usage permission on the schema 
+ Permission to create temporary tables in the current database 

The following example grants usage permission on the schema `spectrum_schema` to the `spectrumusers` user group.

```
grant usage on schema spectrum_schema to group spectrumusers;
```

The following example grants temporary permission on the database `spectrumdb` to the `spectrumusers` user group. 

```
grant temp on database spectrumdb to group spectrumusers;
```

## Pseudocolumns
<a name="r_CREATE_EXTERNAL_TABLE_usage-pseudocolumns"></a>

By default, Amazon Redshift creates external tables with the pseudocolumns *$path* and *$size*. Select these columns to view the path to the data files on Amazon S3 and the size of the data files for each row returned by a query. The *$path* and *$size* column names must be delimited with double quotation marks. A SELECT \* clause doesn't return the pseudocolumns. You must explicitly include the *$path* and *$size* column names in your query, as the following example shows.

```
select "$path", "$size"
from spectrum.sales_part
where saledate = '2008-12-01';
```

You can disable creation of pseudocolumns for a session by setting the *spectrum\_enable\_pseudo\_columns* configuration parameter to *false*. 

**Important**  
Selecting *$size* or *$path* incurs charges because Redshift Spectrum scans the data files in Amazon S3 to determine the size of the result set. For more information, see [Amazon Redshift Pricing](https://aws.amazon.com/redshift/pricing/).

## Setting data handling options
<a name="r_CREATE_EXTERNAL_TABLE_usage-data-handling"></a>

You can set table parameters to specify input handling for data being queried in external tables, including: 
+ Surplus characters in columns containing VARCHAR, CHAR, and string data. For more information, see the external table property `surplus_char_handling`.
+ Invalid characters in columns containing VARCHAR, CHAR, and string data. For more information, see the external table property `invalid_char_handling`.
+ Replacement character to use when you specify REPLACE for the external table property `invalid_char_handling`.
+ Cast overflow handling in columns containing integer and decimal data. For more information, see the external table property `numeric_overflow_handling`.
+ Surplus bytes in columns containing VARBYTE data. For more information, see the external table property `surplus_bytes_handling`.

# Examples
<a name="r_CREATE_EXTERNAL_TABLE_examples"></a>

The following example creates a table named SALES in the Amazon Redshift external schema named `spectrum`. The data is in tab-delimited text files. The TABLE PROPERTIES clause sets the numRows property to 170,000 rows.

Depending on the identity you use to run CREATE EXTERNAL TABLE, there may be IAM permissions that you have to configure. As a best practice, we recommend attaching permissions policies to an IAM role and then assigning it to users and groups as needed. For more information, see [Identity and access management in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html).

```
create external table spectrum.sales(
salesid integer,
listid integer,
sellerid integer,
buyerid integer,
eventid integer,
saledate date,
qtysold smallint,
pricepaid decimal(8,2),
commission decimal(8,2),
saletime timestamp)
row format delimited
fields terminated by '\t'
stored as textfile
location 's3://redshift-downloads/tickit/spectrum/sales/'
table properties ('numRows'='170000');
```

The following example creates a table that uses the JsonSerDe to reference data in JSON format.

```
create external table spectrum.cloudtrail_json (
event_version int,
event_id bigint,
event_time timestamp,
event_type varchar(10),
awsregion varchar(20),
event_name varchar(max),
event_source varchar(max),
requesttime timestamp,
useragent varchar(max),
recipientaccountid bigint)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
with serdeproperties (
'dots.in.keys' = 'true',
'mapping.requesttime' = 'requesttimestamp'
) location 's3://amzn-s3-demo-bucket/json/cloudtrail';
```

The following CREATE EXTERNAL TABLE AS example creates a nonpartitioned external table. Then it writes the result of the SELECT query as Apache Parquet to the target Amazon S3 location.

```
CREATE EXTERNAL TABLE spectrum.lineitem
STORED AS parquet
LOCATION 'S3://amzn-s3-demo-bucket/cetas/lineitem/'
AS SELECT * FROM local_lineitem;
```

The following example creates a partitioned external table and includes the partition columns in the SELECT query. 

```
CREATE EXTERNAL TABLE spectrum.partitioned_lineitem
PARTITIONED BY (l_shipdate, l_shipmode)
STORED AS parquet
LOCATION 'S3://amzn-s3-demo-bucket/cetas/partitioned_lineitem/'
AS SELECT l_orderkey, l_shipmode, l_shipdate, l_partkey FROM local_table;
```

For a list of existing databases in the external data catalog, query the [SVV\_EXTERNAL\_DATABASES](r_SVV_EXTERNAL_DATABASES.md) system view. 

```
select eskind,databasename,esoptions from svv_external_databases order by databasename;
```

```
eskind | databasename | esoptions
-------+--------------+----------------------------------------------------------------------------------
     1 | default      | {"REGION":"us-west-2","IAM_ROLE":"arn:aws:iam::123456789012:role/mySpectrumRole"}
     1 | sampledb     | {"REGION":"us-west-2","IAM_ROLE":"arn:aws:iam::123456789012:role/mySpectrumRole"}
     1 | spectrumdb   | {"REGION":"us-west-2","IAM_ROLE":"arn:aws:iam::123456789012:role/mySpectrumRole"}
```

To view details of external tables, query the [SVV\_EXTERNAL\_TABLES](r_SVV_EXTERNAL_TABLES.md) and [SVV\_EXTERNAL\_COLUMNS](r_SVV_EXTERNAL_COLUMNS.md) system views.

The following example queries the SVV\_EXTERNAL\_TABLES view.

```
select schemaname, tablename, location from svv_external_tables;
```

```
schemaname | tablename            | location
-----------+----------------------+--------------------------------------------------------
spectrum   | sales                | s3://redshift-downloads/tickit/spectrum/sales
spectrum   | sales_part           | s3://redshift-downloads/tickit/spectrum/sales_partition
```

The following example queries the SVV\_EXTERNAL\_COLUMNS view. 

```
select * from svv_external_columns where schemaname like 'spectrum%' and tablename ='sales';
```

```
schemaname | tablename | columnname | external_type | columnnum | part_key
-----------+-----------+------------+---------------+-----------+---------
spectrum   | sales     | salesid    | int           |         1 |        0
spectrum   | sales     | listid     | int           |         2 |        0
spectrum   | sales     | sellerid   | int           |         3 |        0
spectrum   | sales     | buyerid    | int           |         4 |        0
spectrum   | sales     | eventid    | int           |         5 |        0
spectrum   | sales     | saledate   | date          |         6 |        0
spectrum   | sales     | qtysold    | smallint      |         7 |        0
spectrum   | sales     | pricepaid  | decimal(8,2)  |         8 |        0
spectrum   | sales     | commission | decimal(8,2)  |         9 |        0
spectrum   | sales     | saletime   | timestamp     |        10 |        0
```

To view table partitions, use the following query.

```
select schemaname, tablename, values, location
from svv_external_partitions
where tablename = 'sales_part';
```

```
schemaname | tablename  | values         | location
-----------+------------+----------------+-------------------------------------------------------------------------
spectrum   | sales_part | ["2008-01-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01
spectrum   | sales_part | ["2008-02-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-02
spectrum   | sales_part | ["2008-03-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-03
spectrum   | sales_part | ["2008-04-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-04
spectrum   | sales_part | ["2008-05-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-05
spectrum   | sales_part | ["2008-06-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-06
spectrum   | sales_part | ["2008-07-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-07
spectrum   | sales_part | ["2008-08-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-08
spectrum   | sales_part | ["2008-09-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-09
spectrum   | sales_part | ["2008-10-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-10
spectrum   | sales_part | ["2008-11-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-11
spectrum   | sales_part | ["2008-12-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-12
```

The following example returns the total size of related data files for an external table.

```
select distinct "$path", "$size"
   from spectrum.sales_part;

 $path                                                                    | $size
--------------------------------------------------------------------------+-------
s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01/ |  1616
s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-02/ |  1444
```

## Partitioning examples
<a name="r_CREATE_EXTERNAL_TABLE_examples-partitioning"></a>

To create an external table partitioned by date, run the following command.

```
create external table spectrum.sales_part(
salesid integer,
listid integer,
sellerid integer,
buyerid integer,
eventid integer,
dateid smallint,
qtysold smallint,
pricepaid decimal(8,2),
commission decimal(8,2),
saletime timestamp)
partitioned by (saledate date)
row format delimited
fields terminated by '|'
stored as textfile
location 's3://redshift-downloads/tickit/spectrum/sales_partition/'
table properties ('numRows'='170000');
```

To add the partitions, run the following ALTER TABLE commands.

```
alter table spectrum.sales_part
add if not exists partition (saledate='2008-01-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-02-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-02/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-03-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-03/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-04-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-04/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-05-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-05/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-06-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-06/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-07-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-07/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-08-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-08/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-09-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-09/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-10-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-10/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-11-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-11/';
alter table spectrum.sales_part
add if not exists partition (saledate='2008-12-01')
location 's3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-12/';
```

To select data from the partitioned table, run the following query.

```
select top 10 spectrum.sales_part.eventid, sum(spectrum.sales_part.pricepaid)
from spectrum.sales_part, event
where spectrum.sales_part.eventid = event.eventid
  and spectrum.sales_part.pricepaid > 30
  and saledate = '2008-12-01'
group by spectrum.sales_part.eventid
order by 2 desc;
```

```
eventid | sum
--------+---------
    914 | 36173.00
   5478 | 27303.00
   5061 | 26383.00
   4406 | 26252.00
   5324 | 24015.00
   1829 | 23911.00
   3601 | 23616.00
   3665 | 23214.00
   6069 | 22869.00
   5638 | 22551.00
```

To view external table partitions, query the [SVV\_EXTERNAL\_PARTITIONS](r_SVV_EXTERNAL_PARTITIONS.md) system view.

```
select schemaname, tablename, values, location from svv_external_partitions
where tablename = 'sales_part';
```

```
schemaname | tablename  | values         | location
-----------+------------+----------------+--------------------------------------------------
spectrum   | sales_part | ["2008-01-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-01
spectrum   | sales_part | ["2008-02-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-02
spectrum   | sales_part | ["2008-03-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-03
spectrum   | sales_part | ["2008-04-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-04
spectrum   | sales_part | ["2008-05-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-05
spectrum   | sales_part | ["2008-06-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-06
spectrum   | sales_part | ["2008-07-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-07
spectrum   | sales_part | ["2008-08-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-08
spectrum   | sales_part | ["2008-09-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-09
spectrum   | sales_part | ["2008-10-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-10
spectrum   | sales_part | ["2008-11-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-11
spectrum   | sales_part | ["2008-12-01"] | s3://redshift-downloads/tickit/spectrum/sales_partition/saledate=2008-12
```

## Row format examples
<a name="r_CREATE_EXTERNAL_TABLE_examples-row-format"></a>

The following shows an example of specifying the ROW FORMAT SERDE parameters for data files stored in AVRO format.

```
create external table spectrum.sales(salesid int, listid int, sellerid int, buyerid int, eventid int, dateid int, qtysold int, pricepaid decimal(8,2), comment VARCHAR(255))
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES ('avro.schema.literal'='{\"namespace\": \"dory.sample\",\"name\": \"dory_avro\",\"type\": \"record\", \"fields\": [{\"name\":\"salesid\", \"type\":\"int\"},
{\"name\":\"listid\", \"type\":\"int\"},
{\"name\":\"sellerid\", \"type\":\"int\"},
{\"name\":\"buyerid\", \"type\":\"int\"},
{\"name\":\"eventid\",\"type\":\"int\"},
{\"name\":\"dateid\",\"type\":\"int\"},
{\"name\":\"qtysold\",\"type\":\"int\"},
{\"name\":\"pricepaid\", \"type\": {\"type\": \"bytes\", \"logicalType\": \"decimal\", \"precision\": 8, \"scale\": 2}}, {\"name\":\"comment\",\"type\":\"string\"}]}')
STORED AS AVRO
location 's3://amzn-s3-demo-bucket/avro/sales' ;
```

The following shows an example of specifying the ROW FORMAT SERDE parameters using RegEx.

```
create external table spectrum.types(
cbigint bigint,
cbigint_null bigint,
cint int,
cint_null int)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties ('input.regex'='([^\\x01]+)\\x01([^\\x01]+)\\x01([^\\x01]+)\\x01([^\\x01]+)')
stored as textfile
location 's3://amzn-s3-demo-bucket/regex/types';
```

The following shows an example of specifying the ROW FORMAT SERDE parameters using Grok.

```
create external table spectrum.grok_log(
timestamp varchar(255),
pid varchar(255),
loglevel varchar(255),
progname varchar(255),
message varchar(255))
row format serde 'com.amazonaws.glue.serde.GrokSerDe'
with serdeproperties ('input.format'='[DFEWI], \\[%{TIMESTAMP_ISO8601:timestamp} #%{POSINT:pid:int}\\] *(?<loglevel>:DEBUG|FATAL|ERROR|WARN|INFO) -- +%{DATA:progname}: %{GREEDYDATA:message}')
stored as textfile
location 's3://amzn-s3-demo-bucket/grok/logs';
```

The following shows an example of defining an Amazon S3 server access log in an S3 bucket. You can use Redshift Spectrum to query Amazon S3 access logs.

```
CREATE EXTERNAL TABLE spectrum.mybucket_s3_logs(
bucketowner varchar(255),
bucket varchar(255),
requestdatetime varchar(2000),
remoteip varchar(255),
requester varchar(255),
requested varchar(255),
operation varchar(255),
key varchar(255),
requesturi_operation varchar(255),
requesturi_key varchar(255),
requesturi_httpprotoversion varchar(255),
httpstatus varchar(255),
errorcode varchar(255),
bytessent bigint,
objectsize bigint,
totaltime varchar(255),
turnaroundtime varchar(255),
referrer varchar(255),
useragent varchar(255),
versionid varchar(255)
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \"([^ ]*)\\s*([^ ]*)\\s*([^ ]*)\" (- |[^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\") ([^ ]*).*$')
LOCATION 's3://amzn-s3-demo-bucket/s3logs';
```

The following shows an example of specifying the ROW FORMAT SERDE parameters for ION format data.

```
CREATE EXTERNAL TABLE tbl_name (columns)
ROW FORMAT SERDE 'com.amazon.ionhiveserde.IonHiveSerDe'
STORED AS
INPUTFORMAT 'com.amazon.ionhiveserde.formats.IonInputFormat'
OUTPUTFORMAT 'com.amazon.ionhiveserde.formats.IonOutputFormat'
LOCATION 's3://amzn-s3-demo-bucket/prefix'
```

## Data handling examples
<a name="r_CREATE_EXTERNAL_TABLE_examples-data-handling"></a>

The following examples access the file: [spi\_global\_rankings.csv](https://s3.amazonaws.com/redshift-downloads/docs-downloads/spi_global_rankings.csv). You can upload the `spi_global_rankings.csv` file to an Amazon S3 bucket to try these examples.

The following example creates the external schema `schema_spectrum_uddh` and database `spectrum_db_uddh`. For `aws-account-id`, enter your AWS account ID and for `role-name` enter your Redshift Spectrum role name.

```
create external schema schema_spectrum_uddh
from data catalog
database 'spectrum_db_uddh'
iam_role 'arn:aws:iam::aws-account-id:role/role-name'
create external database if not exists;
```

The following example creates the external table `soccer_league` in the external schema `schema_spectrum_uddh`.

```
CREATE EXTERNAL TABLE schema_spectrum_uddh.soccer_league
(
  league_rank smallint,
  prev_rank   smallint,
  club_name   varchar(15),
  league_name varchar(20),
  league_off  decimal(6,2),
  league_def  decimal(6,2),
  league_spi  decimal(6,2),
  league_nspi integer
)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n\l'
stored as textfile
LOCATION 's3://spectrum-uddh/league/'
table properties ('skip.header.line.count'='1');
```

Check the number of rows in the `soccer_league` table.

```
select count(*) from schema_spectrum_uddh.soccer_league;
```

The number of rows is displayed.

```
count
645
```

The following query displays the top 10 clubs. Because club `Barcelona` has an invalid character in the string, a NULL is displayed for the name.

```
select league_rank,club_name,league_name,league_nspi
from schema_spectrum_uddh.soccer_league
where league_rank between 1 and 10;
```

```
league_rank	club_name	league_name			league_nspi
1		Manchester City	Barclays Premier Lea		34595
2		Bayern Munich	German Bundesliga		34151
3		Liverpool	Barclays Premier Lea		33223
4		Chelsea		Barclays Premier Lea		32808
5		Ajax		Dutch Eredivisie		32790
6		Atletico Madrid	Spanish Primera Divi	31517
7		Real Madrid	Spanish Primera Divi	31469
8		NULL		Spanish Primera Divi	31321
9		RB Leipzig	German Bundesliga		31014
10		Paris Saint-Ger	French Ligue 1			30929
```

The following example alters the `soccer_league` table to specify the `invalid_char_handling`, `replacement_char`, and `data_cleansing_enabled` external table properties to insert a question mark (?) as a substitute for unexpected characters.

```
alter  table schema_spectrum_uddh.soccer_league
set table properties ('invalid_char_handling'='REPLACE','replacement_char'='?','data_cleansing_enabled'='true');
```
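
You can verify that the properties took effect by querying the SVV\_EXTERNAL\_TABLES system view. The following query is a sketch; the `parameters` column contains the serialized table properties for each external table.

```
select tablename, parameters
from svv_external_tables
where schemaname = 'schema_spectrum_uddh';
```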

The following example queries the table `soccer_league` for teams with a rank from 1 to 10.

```
select league_rank,club_name,league_name,league_nspi
from schema_spectrum_uddh.soccer_league
where league_rank between 1 and 10;
```

Because the table properties were altered, the results show the top 10 clubs, with the question mark (?) replacement character in the eighth row for club `Barcelona`.

```
league_rank	club_name	league_name		league_nspi
1		Manchester City	Barclays Premier Lea	34595
2		Bayern Munich	German Bundesliga	34151
3		Liverpool	Barclays Premier Lea	33223
4		Chelsea		Barclays Premier Lea	32808
5		Ajax		Dutch Eredivisie	32790
6		Atletico Madrid	Spanish Primera Divi	31517
7		Real Madrid	Spanish Primera Divi	31469
8		Barcel?na	Spanish Primera Divi	31321
9		RB Leipzig	German Bundesliga	31014
10		Paris Saint-Ger	French Ligue 1		30929
```

The following example alters the `soccer_league` table to set the `invalid_char_handling` and `data_cleansing_enabled` external table properties to drop rows with unexpected characters.

```
alter table schema_spectrum_uddh.soccer_league
set table properties ('invalid_char_handling'='DROP_ROW','data_cleansing_enabled'='true');
```

The following example queries the table `soccer_league` for teams with a rank from 1 to 10.

```
select league_rank,club_name,league_name,league_nspi
from schema_spectrum_uddh.soccer_league
where league_rank between 1 and 10;
```

The results display the top clubs, not including the eighth row for club `Barcelona`.

```
league_rank   club_name         league_name            league_nspi
1             Manchester City   Barclays Premier Lea   34595
2             Bayern Munich     German Bundesliga      34151
3             Liverpool         Barclays Premier Lea   33223
4             Chelsea           Barclays Premier Lea   32808
5             Ajax              Dutch Eredivisie       32790
6             Atletico Madrid   Spanish Primera Divi   31517
7             Real Madrid       Spanish Primera Divi   31469
9             RB Leipzig        German Bundesliga      31014
10            Paris Saint-Ger   French Ligue 1         30929
```
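
To turn the cleansing behavior off again, you can set `invalid_char_handling` back to its default value of `DISABLED`. The following is a sketch of the reverse operation, assuming the table altered in the previous examples.

```
alter table schema_spectrum_uddh.soccer_league
set table properties ('invalid_char_handling'='DISABLED','data_cleansing_enabled'='false');
```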

# CREATE EXTERNAL VIEW
<a name="r_CREATE_EXTERNAL_VIEW"></a>

The Data Catalog views preview feature is available only in the following Regions.
+ US East (Ohio) (us-east-2)
+ US East (N. Virginia) (us-east-1)
+ US West (N. California) (us-west-1)
+ Asia Pacific (Tokyo) (ap-northeast-1)
+ Europe (Ireland) (eu-west-1)
+ Europe (Stockholm) (eu-north-1)

Creates a view in the Data Catalog. Data Catalog views have a single view schema that works with other SQL engines such as Amazon Athena and Amazon EMR, so you can query the same view from the engine of your choice. For more information about Data Catalog views, see [Creating Data Catalog views](https://docs.aws.amazon.com/redshift/latest/dg/data-catalog-views-overview.html).

## Syntax
<a name="r_CREATE_EXTERNAL_VIEW-synopsis"></a>

```
CREATE EXTERNAL [ PROTECTED ] VIEW
{catalog_name.schema_name.view_name | awsdatacatalog.dbname.view_name | external_schema_name.view_name} [ IF NOT EXISTS ]
AS query_definition;
```

## Parameters
<a name="r_CREATE_EXTERNAL_VIEW-parameters"></a>

 *schema\_name.view\_name*   
The schema that’s attached to your AWS Glue database, followed by the name of the view.

PROTECTED  
Specifies that the CREATE EXTERNAL VIEW command completes only if the query within the *query\_definition* can successfully complete.

IF NOT EXISTS  
Creates the view if the view doesn’t already exist.

catalog\_name.schema\_name.view\_name | awsdatacatalog.dbname.view\_name | external\_schema\_name.view\_name  
The notation of the schema to use when creating the view. You can specify the AWS Glue Data Catalog, an AWS Glue database that you created, or an external schema that you created. For more information, see [CREATE DATABASE](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_DATABASE.html) and [CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html).

 *query\_definition*   
The definition of the SQL query that Amazon Redshift runs to create the view.

## Examples
<a name="r_CREATE_EXTERNAL_VIEW-examples"></a>

The following example creates a Data Catalog view named `sample_schema.glue_data_catalog_view`.

```
CREATE EXTERNAL PROTECTED VIEW sample_schema.glue_data_catalog_view IF NOT EXISTS
AS SELECT * FROM sample_database.remote_table "remote-table-name";
```
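
Once the view exists, you can query it like any other relation. The following sketch assumes the view created in the previous example is reachable through `sample_schema` in your current session.

```
SELECT * FROM sample_schema.glue_data_catalog_view LIMIT 10;
```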

# CREATE FUNCTION
<a name="r_CREATE_FUNCTION"></a>

Creates a new scalar user-defined function (UDF) using either a SQL SELECT clause or a Python program.

For more information and examples, see [User-defined functions in Amazon Redshift](user-defined-functions.md).

## Required privileges
<a name="r_CREATE_FUNCTION-privileges"></a>

You must have permission in one of the following ways to run CREATE OR REPLACE FUNCTION:
+ For CREATE FUNCTION:
  + Superusers can use both trusted and untrusted languages to create functions.
  + Users with the CREATE [ OR REPLACE ] FUNCTION privilege can create functions with trusted languages.
+ For REPLACE FUNCTION:
  + Superuser
  + Users with the CREATE [ OR REPLACE ] FUNCTION privilege
  + Function owner

## Syntax
<a name="r_CREATE_FUNCTION-synopsis"></a>

```
CREATE [ OR REPLACE ] FUNCTION f_function_name
( [ { py_arg_name py_arg_data_type |
sql_arg_data_type } [ , ... ] ] )
RETURNS data_type
{ VOLATILE | STABLE | IMMUTABLE }
AS $$
  { python_program | SELECT_clause }
$$ LANGUAGE { plpythonu | sql }
```

## Parameters
<a name="r_CREATE_FUNCTION-parameters"></a>

OR REPLACE  
Specifies that if a function with the same name and input argument data types, or *signature*, as this one already exists, the existing function is replaced. You can only replace a function with a new function that defines an identical set of data types. You must be a superuser to replace a function.  
If you define a function with the same name as an existing function but a different signature, you create a new function. In other words, the function name is overloaded. For more information, see [Overloading function names](udf-naming-udfs.md#udf-naming-overloading-function-names).

 *f\_function\_name*   
The name of the function. If you specify a schema name (such as `myschema.myfunction`), the function is created using the specified schema. Otherwise, the function is created in the current schema. For more information about valid names, see [Names and identifiers](r_names.md).  
We recommend that you prefix all UDF names with `f_`. Amazon Redshift reserves the `f_` prefix for UDF names, so by using the `f_` prefix, you ensure that your UDF name will not conflict with any existing or future Amazon Redshift built-in SQL function names. For more information, see [Preventing UDF naming conflicts](udf-naming-udfs.md).  
You can define more than one function with the same function name if the data types for the input arguments are different. In other words, the function name is overloaded. For more information, see [Overloading function names](udf-naming-udfs.md#udf-naming-overloading-function-names).

 *py\_arg\_name py\_arg\_data\_type | sql\_arg\_data\_type*   
For a Python UDF, a list of input argument names and data types. For a SQL UDF, a list of data types, without argument names. In a Python UDF, refer to arguments using the argument names. In a SQL UDF, refer to arguments using $1, $2, and so on, based on the order of the arguments in the argument list.   
For a SQL UDF, the input and return data types can be any standard Amazon Redshift data type. For a Python UDF, the input and return data types can be SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE PRECISION, BOOLEAN, CHAR, VARCHAR, DATE, or TIMESTAMP. In addition, Python user-defined functions (UDFs) support a data type of ANYELEMENT. This is automatically converted to a standard data type based on the data type of the corresponding argument supplied at runtime. If multiple arguments use ANYELEMENT, they all resolve to the same data type at runtime, based on the first ANYELEMENT argument in the list. For more information, see [Python UDF data types](udf-data-types.md) and [Data types](c_Supported_data_types.md).  
You can specify a maximum of 32 arguments.

 RETURNS *data\_type*   
The data type of the value returned by the function. The RETURNS data type can be any standard Amazon Redshift data type. In addition, Python UDFs can use a data type of ANYELEMENT, which is automatically converted to a standard data type based on the argument supplied at runtime. If you specify ANYELEMENT for the return data type, at least one argument must use ANYELEMENT. The actual return data type matches the data type supplied for the ANYELEMENT argument when the function is called. For more information, see [Python UDF data types](udf-data-types.md).

 VOLATILE | STABLE | IMMUTABLE   
Informs the query optimizer about the volatility of the function.   
You will get the best optimization if you label your function with the strictest volatility category that is valid for it. However, if the category is too strict, there is a risk that the optimizer will erroneously skip some calls, resulting in an incorrect result set. In order of strictness, beginning with the least strict, the volatility categories are as follows:  
+ VOLATILE
+ STABLE
+ IMMUTABLE
VOLATILE  
Given the same arguments, the function can return different results on successive calls, even for the rows in a single statement. The query optimizer can't make any assumptions about the behavior of a volatile function, so a query that uses a volatile function must reevaluate the function for every input row.  
STABLE  
Given the same arguments, the function is guaranteed to return the same results for all rows processed within a single statement. The function can return different results when called in different statements. This category allows the optimizer to optimize multiple calls of the function within a single statement to a single call for the statement.   
IMMUTABLE  
Given the same arguments, the function always returns the same result, forever. When a query calls an `IMMUTABLE` function with constant arguments, the optimizer pre-evaluates the function.

AS $$ *statement* $$  
 A construct that encloses the statement to be run. The literal keywords `AS $$` and `$$` are required.   
Amazon Redshift requires you to enclose the statement in your function by using a format called dollar quoting. Anything within the enclosure is passed exactly as is. You don't need to escape any special characters because the contents of the string are written literally.   
 With *dollar quoting*, you use a pair of dollar signs ($$) to signify the start and the end of the statement to run, as shown in the following example.   

```
$$ my statement $$
```
 Optionally, between the dollar signs in each pair, you can specify a string to help identify the statement. The string that you use must be the same in both the start and the end of the enclosure pairs. This string is case-sensitive, and it follows the same constraints as an unquoted identifier except that it can't contain dollar signs. The following example uses the string `test`.   

```
$test$ my statement $test$
```
For more information about dollar quoting, see "Dollar-quoted String Constants" under [ Lexical Structure](https://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html) in the PostgreSQL documentation. 
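
As an illustration, the following hypothetical SQL UDF uses the tagged pair `$fn$ ... $fn$` in place of the plain `$$` pair; the function name and body are examples only.

```
create function f_greeting (varchar)
  returns varchar
stable
as $fn$
  select 'Hello, ' || $1
$fn$ language sql;
```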

*python\_program*   
A valid executable Python program that returns a value. The statement that you pass in with the function must conform to indentation requirements as specified in the [Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/#indentation) on the Python website. For more information, see [Python language support for UDFs](udf-python-language-support.md).

*SELECT\_clause*   
A SQL SELECT clause.  
The SELECT clause can't include any of the following types of clauses:  
+ FROM
+ INTO
+ WHERE
+ GROUP BY
+ ORDER BY
+ LIMIT

LANGUAGE { plpythonu | sql }  
For Python, specify `plpythonu`. For SQL, specify `sql`. You must have USAGE ON LANGUAGE permission for `sql` or `plpythonu`. For more information, see [UDF security and permissions](udf-security-and-privileges.md).

## Usage notes
<a name="r_CREATE_FUNCTION-usage-notes"></a>

### Nested functions
<a name="r_CREATE_FUNCTION-usage-notes-nested-functions"></a>

You can call another SQL user-defined function (UDF) from within a SQL UDF. The nested function must exist when you run the CREATE FUNCTION command. Amazon Redshift doesn't track dependencies for UDFs, so if you drop the nested function, Amazon Redshift doesn't return an error. However, the UDF will fail if the nested function doesn't exist. For example, the following function calls the `f_sql_greater` function in the SELECT clause.

```
create function f_sql_commission (float, float )
  returns float
stable
as $$
  select f_sql_greater ($1, $2)
$$ language sql;
```

### UDF security and privileges
<a name="r_CREATE_FUNCTION-usage-notes-security-and-privileges"></a>

To create a UDF, you must have permission for usage on language for SQL or plpythonu (Python). By default, USAGE ON LANGUAGE SQL is granted to PUBLIC. However, you must explicitly grant USAGE ON LANGUAGE PLPYTHONU to specific users or groups. 

To revoke usage for SQL, first revoke usage from PUBLIC. Then grant usage on SQL only to the specific users or groups permitted to create SQL UDFs. The following example revokes usage on SQL from PUBLIC then grants usage to the user group `udf_devs`.

```
revoke usage on language sql from PUBLIC;
grant usage on language sql to group udf_devs;
```

To run a UDF, you must have execute permission for each function. By default, execute permission for new UDFs is granted to PUBLIC. To restrict usage, revoke execute permission from PUBLIC for the function. Then grant the privilege to specific individuals or groups. 

The following example revokes execute permission on function `f_py_greater` from PUBLIC then grants usage to the user group `udf_devs`.

```
revoke execute on function f_py_greater(a float, b float) from PUBLIC;
grant execute on function f_py_greater(a float, b float) to group udf_devs;
```

Superusers have all privileges by default. 

For more information, see [GRANT](r_GRANT.md) and [REVOKE](r_REVOKE.md).

## Examples
<a name="r_CREATE_FUNCTION-examples"></a>

### Scalar Python UDF example
<a name="r_CREATE_FUNCTION-python-example"></a>

The following example creates a Python UDF that compares two numbers and returns the larger value.

```
create function f_py_greater (a float, b float)
  returns float
stable
as $$
  if a > b:
    return a
  return b
$$ language plpythonu;
```

The following example queries the SALES table and calls the new `f_py_greater` function to return either COMMISSION or 20 percent of PRICEPAID, whichever is greater.

```
select f_py_greater (commission, pricepaid*0.20) from sales;
```

### Scalar SQL UDF example
<a name="r_CREATE_FUNCTION-sql-example"></a>

The following example creates a function that compares two numbers and returns the larger value. 

```
create function f_sql_greater (float, float)
  returns float
stable
as $$
  select case when $1 > $2 then $1
    else $2
  end
$$ language sql;
```

The following query calls the new `f_sql_greater` function to query the SALES table and returns either COMMISSION or 20 percent of PRICEPAID, whichever is greater.

```
select f_sql_greater (commission, pricepaid*0.20) from sales;
```

# CREATE GROUP
<a name="r_CREATE_GROUP"></a>

Defines a new user group. Only a superuser can create a group.

## Syntax
<a name="r_CREATE_GROUP-synopsis"></a>

```
CREATE GROUP group_name
[ [ WITH ] [ USER username ] [, ...] ]
```

## Parameters
<a name="r_CREATE_GROUP-parameters"></a>

 *group\_name*   
Name of the new user group. Group names beginning with two underscores are reserved for Amazon Redshift internal use. For more information about valid names, see [Names and identifiers](r_names.md).

WITH  
Optional syntax to indicate additional parameters for CREATE GROUP.

USER  
Add one or more users to the group.

 *username*   
Name of the user to add to the group.

## Examples
<a name="r_CREATE_GROUP-examples"></a>

The following example creates a user group named `ADMIN_GROUP` with two users, ADMIN1 and ADMIN2.

```
create group admin_group with user admin1, admin2;
```
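
Groups are typically used as targets for GRANT statements. As a sketch, assuming a table named `sales` exists, the following gives the new group read access to it:

```
grant select on table sales to group admin_group;
```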

# CREATE IDENTITY PROVIDER
<a name="r_CREATE_IDENTITY_PROVIDER"></a>

Defines a new identity provider. Only a superuser can create an identity provider.

## Syntax
<a name="r_CREATE_IDENTITY_PROVIDER-synopsis"></a>

```
CREATE IDENTITY PROVIDER identity_provider_name TYPE type_name
NAMESPACE namespace_name
[PARAMETERS parameter_string]
[APPLICATION_ARN arn]
[IAM_ROLE iam_role]
[AUTO_CREATE_ROLES
    [ TRUE [ { INCLUDE | EXCLUDE } GROUPS LIKE filter_pattern] |
      FALSE
    ]
  ];
```

## Parameters
<a name="r_CREATE_IDENTITY_PROVIDER-parameters"></a>

 *identity\_provider\_name*   
Name of the new identity provider. For more information about valid names, see [Names and identifiers](r_names.md).

*type\_name*  
The identity provider to interface with. Azure and AWSIDC are currently the only supported identity providers.

*namespace\_name*  
The namespace. This is a unique, shorthand identifier for the identity provider directory.

 *parameter\_string*   
A string containing a properly formatted JSON object that contains parameters and values required for the identity provider.

 *arn*   
The Amazon Resource Name (ARN) for an IAM Identity Center managed application. This parameter is applicable only when the identity-provider type is AWSIDC.

 *iam\_role*   
The IAM role that provides permissions to make the connection to IAM Identity Center. This parameter is applicable only when the identity-provider type is AWSIDC.

 *auto\_create\_roles*   
Enables or disables the auto-create role feature. If the value is TRUE, Amazon Redshift enables the auto-create role feature. If the value is FALSE, Amazon Redshift disables the auto-create role feature. If the value for this parameter isn't specified, Amazon Redshift determines the value using the following logic:   
+  If `AUTO_CREATE_ROLES` is provided but the value isn't specified, the value is set to TRUE. 
+  If `AUTO_CREATE_ROLES` isn't provided and the identity provider is AWSIDC, the value is set to FALSE. 
+  If `AUTO_CREATE_ROLES` isn't provided and the identity provider is Azure, the value is set to TRUE. 
To include groups, specify `INCLUDE`. The default is empty, which means include all groups if `AUTO_CREATE_ROLES` is on.  
To exclude groups, specify `EXCLUDE`. The default is empty, which means do not exclude any groups if `AUTO_CREATE_ROLES` is on.

 *filter\_pattern*   
A valid UTF-8 character expression with a pattern to match group names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_IDENTITY_PROVIDER.html)
If *filter\_pattern* does not contain metacharacters, then the pattern only represents the string itself; in that case LIKE acts the same as the equals operator.   
*filter\_pattern* supports the following characters:  
+  Uppercase and lowercase alphabetic characters (A-Z and a-z) 
+  Numerals (0-9) 
+  The following special characters: 

  ```
  _ % ^ * + ? { } , $
  ```
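
The following sketch combines AUTO\_CREATE\_ROLES with a group filter; the ARNs and the `data_%` pattern are hypothetical values.

```
CREATE IDENTITY PROVIDER "redshift-idc-filtered" TYPE AWSIDC
NAMESPACE 'awsidc'
APPLICATION_ARN 'arn:aws:sso::123456789012:application/ssoins-12345f67fe123d4/apl-a0b0a12dc123b1a4'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
AUTO_CREATE_ROLES TRUE INCLUDE GROUPS LIKE 'data_%';
```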

## Examples
<a name="r_CREATE_IDENTITY_PROVIDER-examples"></a>

The following example creates an identity provider named *oauth\_standard*, with a TYPE *azure*, to establish communication with Microsoft Azure Active Directory (AD).

```
CREATE IDENTITY PROVIDER oauth_standard TYPE azure
NAMESPACE 'aad'
PARAMETERS '{"issuer":"https://sts.windows.net/2sdfdsf-d475-420d-b5ac-667adad7c702/",
"client_id":"87f4aa26-78b7-410e-bf29-57b39929ef9a",
"client_secret":"BUAH~ewrqewrqwerUUY^%tHe1oNZShoiU7",
"audience":["https://analysis.windows.net/powerbi/connector/AmazonRedshift"]
}'
```

You can connect an IAM Identity Center managed application with an existing provisioned cluster or Amazon Redshift Serverless workgroup. This gives you the ability to manage access to a Redshift database through IAM Identity Center. To do so, run a SQL command like the following sample. You have to be a database administrator.

```
CREATE IDENTITY PROVIDER "redshift-idc-app" TYPE AWSIDC
NAMESPACE 'awsidc'
APPLICATION_ARN 'arn:aws:sso::123456789012:application/ssoins-12345f67fe123d4/apl-a0b0a12dc123b1a4'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole';
```

The application ARN in this case identifies the managed application to connect to. You can find it by running `SELECT * FROM SVV_IDENTITY_PROVIDERS;`.

For more information about using CREATE IDENTITY PROVIDER, including additional examples, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html). For more information about setting up a connection to IAM Identity Center from Redshift, see [Connect Redshift with IAM Identity Center to give users a single sign-on experience](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-idp-connect.html).

# CREATE LIBRARY
<a name="r_CREATE_LIBRARY"></a>

Installs a Python library, which is available for users to incorporate when creating a user-defined function (UDF) with the [CREATE FUNCTION](r_CREATE_FUNCTION.md) command. The total size of user-installed libraries can't exceed 100 MB. 

CREATE LIBRARY can't be run inside a transaction block (BEGIN … END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

Amazon Redshift supports Python version 2.7. For more information, see [www.python.org](https://www.python.org/).

For more information, see [Example: Importing custom Python library modules](udf-importing-custom-python-library-modules.md). 

## Required privileges
<a name="r_CREATE_LIBRARY-privileges"></a>

Following are required privileges for CREATE LIBRARY:
+ Superuser
+ Users with the CREATE LIBRARY privilege or with USAGE privilege on the specified language

## Syntax
<a name="r_CREATE_LIBRARY-synopsis"></a>

```
CREATE [ OR REPLACE ] LIBRARY library_name LANGUAGE plpythonu
FROM
{ 'https://file_url'
| 's3://bucketname/file_name'
authorization
  [ REGION [AS] 'aws_region']
  IAM_ROLE { default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' }
}
```

## Parameters
<a name="r_CREATE_LIBRARY-parameters"></a>

OR REPLACE  
Specifies that if a library with the same name as this one already exists, the existing library is replaced. REPLACE commits immediately. If a UDF that depends on the library is running concurrently, the UDF might fail or return unexpected results, even if the UDF is running within a transaction. You must be the owner or a superuser to replace a library.

 *library\$1name*   
The name of the library to be installed. You can't create a library that contains a module with the same name as a Python Standard Library module or an Amazon Redshift preinstalled Python module. If an existing user-installed library uses the same Python package as the library to be installed, you must drop the existing library before installing the new library. For more information, see [Python language support for UDFs](udf-python-language-support.md).

LANGUAGE plpythonu  
The language to use. Python (plpythonu) is the only supported language. Amazon Redshift supports Python version 2.7. For more information, see [www.python.org](https://www.python.org/).

FROM  
The location of the library file. You can specify an Amazon S3 bucket and object name, or you can specify a URL to download the file from a public website. The library must be packaged in the form of a `.zip` file. For more information, see [Building and Installing Python Modules](https://docs.python.org/2/library/distutils.html?highlight=distutils#module-distutils) in the Python documentation.

 https://*file\_url*   
The URL to download the file from a public website. The URL can contain up to three redirects. The following is an example of a file URL.  

```
'https://www.example.com/pylib.zip'
```

 s3://*bucket\_name/file\_name*   
The path to a single Amazon S3 object that contains the library file. The following is an example of an Amazon S3 object path.  

```
's3://amzn-s3-demo-bucket/my-pylib.zip'
```
If you specify an Amazon S3 bucket, you must also provide credentials for an AWS user that has permission to download the file.   
 If the Amazon S3 bucket doesn't reside in the same AWS Region as your Amazon Redshift cluster, you must use the REGION option to specify the AWS Region in which the data is located. The value for *aws\_region* must match an AWS Region listed in the table in the [REGION](copy-parameters-data-source-s3.md#copy-region) parameter description for the COPY command.

*authorization*   
A clause that indicates the method your cluster uses for authentication and authorization to access the Amazon S3 bucket that contains the library file. Your cluster must have permission to access Amazon S3 with the LIST and GET actions.  
The syntax for authorization is the same as for the COPY command authorization. For more information, see [Authorization parameters](copy-parameters-authorization.md).  

```
IAM_ROLE { default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' }
```
 Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE LIBRARY command runs.  
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. If you specify IAM\_ROLE, you can't use ACCESS\_KEY\_ID and SECRET\_ACCESS\_KEY, SESSION\_TOKEN, or CREDENTIALS.  
Optionally, if the Amazon S3 bucket uses server-side encryption, provide the encryption key in the credentials-args string. If you use temporary security credentials, provide the temporary token in the *credentials-args* string.  
For more information, see [Temporary security credentials](copy-usage_notes-access-permissions.md#r_copy-temporary-security-credentials).

 REGION [AS] *aws\_region*   
The AWS Region where the Amazon S3 bucket is located. REGION is required when the Amazon S3 bucket isn't in the same AWS Region as the Amazon Redshift cluster. The value for *aws\_region* must match an AWS Region listed in the table in the [REGION](copy-parameters-data-source-s3.md#copy-region) parameter description for the COPY command.  
By default, CREATE LIBRARY assumes that the Amazon S3 bucket is located in the same AWS Region as the Amazon Redshift cluster.

## Examples
<a name="r_CREATE_LIBRARY-examples"></a>

The following two examples install the [urlparse](https://docs.python.org/2/library/urlparse.html#module-urlparse) Python module, which is packaged in a file named `urlparse3-1.0.3.zip`. 

The following command installs a UDF library named `f_urlparse` from a package that has been uploaded to an Amazon S3 bucket located in the US East (N. Virginia) Region.

```
create library f_urlparse
language plpythonu
from 's3://amzn-s3-demo-bucket/urlparse3-1.0.3.zip'
credentials 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>'
region as 'us-east-1';
```

The following example installs a library named `f_urlparse` from a library file on a website.



```
create library f_urlparse
language plpythonu
from 'https://example.com/packages/urlparse3-1.0.3.zip';
```
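
To see which user-installed libraries are present, you can query the PG\_LIBRARY catalog table. This is a sketch; available columns can vary by Amazon Redshift version.

```
select * from pg_library;
```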

# CREATE MASKING POLICY
<a name="r_CREATE_MASKING_POLICY"></a>

Creates a new dynamic data masking policy to obfuscate data of a given format. For more information on dynamic data masking, see [Dynamic data masking](t_ddm.md).

Superusers and users or roles that have the sys:secadmin role can create a masking policy.

## Syntax
<a name="r_CREATE_MASKING_POLICY-synopsis"></a>

```
CREATE MASKING POLICY 
   { policy_name | database_name.policy_name } [IF NOT EXISTS]
   WITH (input_columns)
   USING (masking_expression);
```

## Parameters
<a name="r_CREATE_MASKING_POLICY-parameters"></a>

 *policy\_name*   
The name of the masking policy. The masking policy can't have the same name as another masking policy that already exists in the database.

*database\_name*  
The name of the database where the policy is created. The policy can be created in the connected database or in an Amazon Redshift Federated Permissions Catalog.

*input\_columns*   
A tuple of column names in the format (col1 type, col2 type ...).  
Column names are used as the input for the masking expression. Column names don't have to match the names of the columns being masked, but the input and output data types must match.

*masking\_expression*  
The SQL expression used to transform the target columns. You can write it using data manipulation functions, such as string manipulation functions, or in conjunction with user-defined functions written in SQL or Python, or with AWS Lambda. You can include a tuple of column expressions for masking policies that have multiple outputs. If you use a constant as your masking expression, you must explicitly cast it to a type that matches the input type.  
 You must have the USAGE permission on any user-defined functions that you use in the masking expression. 

For the usage of CREATE MASKING POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).
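
For example, the following sketch creates a policy that replaces a credit card number with a constant. The policy and column names are illustrative; note the explicit cast, which is required when the masking expression is a constant.

```
CREATE MASKING POLICY mask_credit_card_full
WITH (credit_card VARCHAR(256))
USING ('000000XXXX0000'::TEXT);
```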

# CREATE MATERIALIZED VIEW
<a name="materialized-view-create-sql-command"></a>

Creates a materialized view based on one or more Amazon Redshift tables. You can also base materialized views on external tables created using Spectrum or federated query. For information about Spectrum, see [Amazon Redshift Spectrum](c-using-spectrum.md). For information about federated query, see [Querying data with federated queries in Amazon Redshift](federated-overview.md).

## Syntax
<a name="mv_CREATE_MATERIALIZED_VIEW-synopsis"></a>

```
CREATE MATERIALIZED VIEW mv_name
[ BACKUP { YES | NO } ]
[ table_attributes ]
[ AUTO REFRESH { YES | NO } ]
AS query
```

## Parameters
<a name="mv_CREATE_MATERIALIZED_VIEW-parameters"></a>

BACKUP  
A clause that specifies whether the materialized view should be included in automated and manual cluster snapshots.   
For materialized views that don't contain critical data, specify BACKUP NO to save processing time when creating snapshots and restoring from snapshots and to reduce storage space on Amazon Simple Storage Service. The BACKUP NO setting has no effect on automatic replication of data to other nodes within the cluster, so materialized views with BACKUP NO specified are restored in the event of a node failure. The default is BACKUP YES.

 *table\_attributes*   
A clause that specifies how the data in the materialized view is distributed, including the following:  
+  The distribution style for the materialized view, in the format `DISTSTYLE { EVEN | ALL | KEY }`. If you omit this clause, the distribution style is `EVEN`. For more information, see [Distribution styles](c_choosing_dist_sort.md).
+ The distribution key for the materialized view, in the format `DISTKEY ( distkey_identifier )`. For more information, see [Designating distribution styles](t_designating_distribution_styles.md).
+ The sort key for the materialized view, in the format `SORTKEY ( column_name [, ...] )`. For more information, see [Sort keys](t_Sorting_data.md).

AS *query*  
A valid `SELECT` statement that defines the materialized view and its content. The result set from the query defines the columns and rows of the materialized view. For information about limitations when creating materialized views, see [Limitations](#mv_CREATE_MATERIALIZED_VIEW-limitations).  
Furthermore, the specific SQL language constructs used in the query determine whether the materialized view can be incrementally or fully refreshed. For information about the refresh method, see [REFRESH MATERIALIZED VIEW](materialized-view-refresh-sql-command.md). For information about the limitations for incremental refresh, see [Limitations for incremental refresh](materialized-view-refresh-sql-command.md#mv_REFRESH_MARTERIALIZED_VIEW_limitations).  
If the query contains an SQL command that doesn't support incremental refresh, Amazon Redshift displays a message indicating that the materialized view will use a full refresh. The message may or may not be displayed, depending on the SQL client application. Check the `state` column of [STV\_MV\_INFO](r_STV_MV_INFO.md) to see the refresh type used by a materialized view.

AUTO REFRESH  
A clause that defines whether the materialized view should be automatically refreshed with latest changes from its base tables. The default value is `NO`. For more information, see [Refreshing a materialized view](materialized-view-refresh.md).

## Usage notes
<a name="mv_CREATE_MARTERIALIZED_VIEW_usage"></a>

To create a materialized view, you must have the following privileges:
+ CREATE privileges for a schema.
+ Table-level or column-level SELECT privilege on the base tables to create a materialized view. If you have column-level privileges on specific columns, you can create a materialized view on only those columns.

 You can create a materialized view from a remote datasharing cluster by providing the external database name as part of `mv_name`. 
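For example, a sketch of a materialized view defined over a shared table, assuming a hypothetical consumer database `consumer_db` created from a datashare:

```
-- Hypothetical names: consumer_db is a database created from a datashare.
CREATE MATERIALIZED VIEW shared_orders_mv AS
SELECT orderid, ordertime
FROM consumer_db.public.orders;
```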

## Incremental refresh for materialized views in a datashare
<a name="mv_CREATE_MARTERIALIZED_VIEW_datashare"></a>

 Amazon Redshift supports automatic and incremental refresh for materialized views in a consumer datashare when the base tables are shared. Incremental refresh is an operation where Amazon Redshift identifies changes in the base table or tables that happened after the previous refresh and updates only the corresponding records in the materialized view. This runs more quickly than a full refresh and improves workload performance. You don't have to change your materialized-view definition to take advantage of incremental refresh. 

There are a couple of limitations to note when taking advantage of incremental refresh with a materialized view: 
+ The materialized view must reference only one database, either local or remote. 
+ Incremental refresh is available only on new materialized views. Therefore, you must drop existing materialized views and recreate them for incremental refresh to occur.

For more information about creating materialized views in a datashare, see [Working with views in Amazon Redshift data sharing](https://docs.aws.amazon.com/redshift/latest/dg/datashare-views), which contains several query examples.

## DDL updates to materialized views or base tables
<a name="materialized-view-ddl"></a>

When using materialized views in Amazon Redshift, follow these usage notes for data definition language (DDL) updates to materialized views or base tables.
+ You can add columns to a base table without affecting any materialized views that reference the base table.
+ Some operations can leave the materialized view in a state that can't be refreshed at all. Examples are operations such as renaming or dropping a column, changing the type of a column, and changing the name of a schema. Such materialized views can be queried but can't be refreshed. In this case, you must drop and recreate the materialized view. 
+ In general, you can't alter a materialized view's definition (its SQL statement).
+ You can't rename a materialized view. 
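As a sketch of the behavior above (table and view names are hypothetical), adding a column leaves the materialized view refreshable, while renaming a referenced column does not:

```
CREATE TABLE base_t (id int, val int);
CREATE MATERIALIZED VIEW base_t_mv AS SELECT id, val FROM base_t;

-- Safe: the materialized view doesn't reference the new column.
ALTER TABLE base_t ADD COLUMN note varchar(20);
REFRESH MATERIALIZED VIEW base_t_mv;

-- Breaking: a referenced column is renamed. The materialized view can
-- still be queried, but it can't be refreshed until you drop and
-- recreate it.
ALTER TABLE base_t RENAME COLUMN val TO val2;
```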

## Limitations
<a name="mv_CREATE_MATERIALIZED_VIEW-limitations"></a>

You can't define a materialized view that references or includes any of the following:
+ Standard views, or system tables and views.
+ Temporary tables.
+ User-defined functions.
+ The ORDER BY, LIMIT, or OFFSET clause.
+ Late-binding references to base tables. In other words, any base tables or related columns referenced in the defining SQL query of the materialized view must exist and must be valid. 
+ Leader node-only functions: CURRENT\_SCHEMA, CURRENT\_SCHEMAS, HAS\_DATABASE\_PRIVILEGE, HAS\_SCHEMA\_PRIVILEGE, HAS\_TABLE\_PRIVILEGE.
+ You can't use the AUTO REFRESH YES option when the materialized view definition includes mutable functions or external schemas. You also can't use it when you define a materialized view on another materialized view.
+ You don't have to manually run [ANALYZE](r_ANALYZE.md) on materialized views. Currently, they are analyzed only through auto analyze. For more information, see [Analyzing tables](t_Analyzing_tables.md).
+ RLS-protected or DDM-protected tables. 
+ Materialized view creation from remote datasharing clusters doesn't support references to other materialized views, Spectrum tables, tables defined in a different Redshift cluster, or UDFs. These are supported for materialized view creation from the local (producer) cluster. 
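For instance, a definition that uses one of the disallowed constructs, such as ORDER BY or LIMIT, is rejected (names are hypothetical):

```
-- Fails: ORDER BY and LIMIT aren't allowed in a materialized view
-- definition, although they're fine in queries against the view.
CREATE MATERIALIZED VIEW top_sales_mv AS
SELECT salesid, pricepaid
FROM sales
ORDER BY pricepaid DESC
LIMIT 10;
```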

## Examples
<a name="mv_CREATE_MARTERIALIZED_VIEW_examples"></a>

The following example creates a materialized view from three base tables that are joined and aggregated. Each row represents a category with the number of tickets sold. When you query the tickets\_mv materialized view, you access the precomputed data directly.

```
CREATE MATERIALIZED VIEW tickets_mv AS
    select   catgroup,
    sum(qtysold) as sold
    from     category c, event e, sales s
    where    c.catid = e.catid
    and      e.eventid = s.eventid
    group by catgroup;
```
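Once the materialized view exists, you can read the precomputed results with an ordinary SELECT, for example:

```
SELECT catgroup, sold
FROM tickets_mv
ORDER BY sold DESC;
```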

The following example creates a materialized view similar to the previous example and uses the aggregate function MAX(). 

```
CREATE MATERIALIZED VIEW tickets_mv_max AS
    select   catgroup,
    max(qtysold) as sold
    from     category c, event e, sales s
    where    c.catid = e.catid
    and      e.eventid = s.eventid
    group by catgroup;

SELECT name, state FROM STV_MV_INFO;
```

The following example uses a UNION ALL clause to join the Amazon Redshift `public_sales` table and the Redshift Spectrum `spectrum.sales` table to create a materialized view `mv_sales_vw`. For information about the CREATE EXTERNAL TABLE command for Amazon Redshift Spectrum, see [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md). The Redshift Spectrum external table references the data on Amazon S3.

```
CREATE MATERIALIZED VIEW mv_sales_vw as
select salesid, qtysold, pricepaid, commission, saletime from public.sales
union all
select salesid, qtysold, pricepaid, commission, saletime from spectrum.sales
```

The following example creates a materialized view `mv_fq` based on a federated query external table. For information about federated query, see [CREATE EXTERNAL SCHEMA](r_CREATE_EXTERNAL_SCHEMA.md).

```
CREATE MATERIALIZED VIEW mv_fq as select firstname, lastname from apg.mv_fq_example;

select firstname, lastname from mv_fq;
 firstname | lastname
-----------+----------
 John      | Day
 Jane      | Doe
(2 rows)
```

The following example shows the definition of a materialized view.

```
SELECT pg_catalog.pg_get_viewdef('mv_sales_vw'::regclass::oid, true);

pg_get_viewdef
---------------------------------------------------
create materialized view mv_sales_vw as select a from t;
```

 The following sample shows how to set AUTO REFRESH in the materialized view definition and also specifies a DISTSTYLE. First, create a simple base table. 

```
CREATE TABLE baseball_table (ball int, bat int);
```

Then, create a materialized view.

```
CREATE MATERIALIZED VIEW mv_baseball DISTSTYLE ALL AUTO REFRESH YES AS SELECT ball AS baseball FROM baseball_table;
```

Now you can query the mv\_baseball materialized view. To check if AUTO REFRESH is turned on for a materialized view, see [STV\_MV\_INFO](r_STV_MV_INFO.md).
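For example, one way to inspect the view's refresh settings is to query STV\_MV\_INFO directly (this sketch assumes the `state` and `autorefresh` columns as described in the STV\_MV\_INFO reference):

```
-- The autorefresh column indicates whether AUTO REFRESH is turned on.
SELECT name, state, autorefresh
FROM stv_mv_info
WHERE name = 'mv_baseball';
```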

The following sample creates a materialized view that references a source table in another database. It assumes that the database containing the source table, database\_A, is in the same cluster or workgroup as your materialized view, which you create in database\_B. (You can substitute your own databases for the sample.) First, create a table in database\_A called *cities*, with a *cityname* column. Make the column's data type a VARCHAR. After you create the source table, run the following command in database\_B to create a materialized view whose source is your *cities* table. Make sure to specify the source table's database and schema in the FROM clause:

```
CREATE MATERIALIZED VIEW cities_mv AS
SELECT  cityname
FROM    database_A.public.cities;
```

Query the materialized view you created. The query retrieves records whose original source is the *cities* table in database\_A:

```
select * from cities_mv;
```

When you run the SELECT statement, *cities\_mv* returns the records. Records are refreshed from the source table only when a REFRESH statement is run. Also, note that you can't update records directly in the materialized view. For information about refreshing the data in a materialized view, see [REFRESH MATERIALIZED VIEW](materialized-view-refresh-sql-command.md).
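To pick up rows inserted into the source table after the materialized view was created, run a refresh and then query again:

```
REFRESH MATERIALIZED VIEW cities_mv;

SELECT * FROM cities_mv;
```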

For an overview of materialized views and the SQL commands used to refresh and drop them, see the following topics:
+ [Materialized views in Amazon Redshift](materialized-view-overview.md)
+ [REFRESH MATERIALIZED VIEW](materialized-view-refresh-sql-command.md)
+ [DROP MATERIALIZED VIEW](materialized-view-drop-sql-command.md)

# CREATE MODEL
<a name="r_CREATE_MODEL"></a>

**Topics**
+ [Prerequisites](#r_create_model_prereqs)
+ [Required privileges](#r_simple_create_model-privileges)
+ [Cost control](#r_create_model_cost)
+ [Full CREATE MODEL](#r_full_create_model)
+ [Parameters](#r_create_model_parameters)
+ [Usage notes](r_create_model_usage_notes.md)
+ [Use cases](r_create_model_use_cases.md)

## Prerequisites
<a name="r_create_model_prereqs"></a>

Before you use the CREATE MODEL statement, complete the prerequisites in [Cluster setup for using Amazon Redshift ML](getting-started-machine-learning.md#cluster-setup). The following is a high-level summary of the prerequisites.
+ Create an Amazon Redshift cluster with the AWS Management Console or the AWS Command Line Interface (AWS CLI).
+ Attach the AWS Identity and Access Management (IAM) policy while creating the cluster.
+ To allow Amazon Redshift and SageMaker AI to assume the role to interact with other services, add the appropriate trust policy to the IAM role.

For details for the IAM role, trust policy, and other prerequisites, see [Cluster setup for using Amazon Redshift ML](getting-started-machine-learning.md#cluster-setup).

Following, you can find different use cases for the CREATE MODEL statement.
+ [Simple CREATE MODEL](r_create_model_use_cases.md#r_simple_create_model)
+ [CREATE MODEL with user guidance](r_create_model_use_cases.md#r_user_guidance_create_model)
+ [CREATE XGBoost models with AUTO OFF](r_create_model_use_cases.md#r_auto_off_create_model)
+ [Bring your own model (BYOM) - local inference](r_create_model_use_cases.md#r_byom_create_model)
+ [Bring your own model (BYOM) - remote inference](r_create_model_use_cases.md#r_byom_create_model_remote)
+ [CREATE MODEL with K-MEANS](r_create_model_use_cases.md#r_k-means_create_model)
+ [Full CREATE MODEL](#r_full_create_model)

## Required privileges
<a name="r_simple_create_model-privileges"></a>

Following are required privileges for CREATE MODEL:
+ Superuser
+ Users with the CREATE MODEL privilege
+ Roles with the GRANT CREATE MODEL privilege

## Cost control
<a name="r_create_model_cost"></a>

 Amazon Redshift ML uses existing cluster resources to create prediction models, so you don’t have to pay additional costs. However, you might have additional costs if you need to resize your cluster or want to train your models. Amazon Redshift ML uses Amazon SageMaker AI to train models, which does have an additional associated cost. There are ways to control additional costs, such as limiting the maximum amount of time training can take or by limiting the number of training examples used to train your model. For more information, see [Costs for using Amazon Redshift ML](https://docs.aws.amazon.com/redshift/latest/dg/cost.html). 
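As a sketch of the cost controls mentioned above, the MAX\_RUNTIME and MAX\_CELLS settings can cap training time and training-set size (the table, column, function, and bucket names here are hypothetical):

```
CREATE MODEL churn_model
FROM customer_activity          -- hypothetical training table
TARGET churned
FUNCTION predict_churn_capped
IAM_ROLE default
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  MAX_RUNTIME 1800,    -- stop training after 30 minutes
  MAX_CELLS 500000     -- downsample training data beyond 500,000 cells
);
```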

## Full CREATE MODEL
<a name="r_full_create_model"></a>

The following summarizes the basic options of the full CREATE MODEL syntax.

### Full CREATE MODEL syntax
<a name="r_auto_off-create-model-synposis"></a>

The following is the full syntax of the CREATE MODEL statement.

**Important**  
When creating a model using the CREATE MODEL statement, follow the order of the keywords in the syntax following.

```
CREATE MODEL model_name
FROM { table_name | ( select_statement )  | 'job_name' }
[ TARGET column_name ]
FUNCTION function_name [ ( data_type [, ...] ) ] 
[ RETURNS data_type ] 
  -- supported only for BYOM
[ SAGEMAKER 'endpoint_name'[:'model_name']] 
  -- supported only for BYOM remote inference
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
[ AUTO ON / OFF ]
  -- default is AUTO ON
[ MODEL_TYPE { XGBOOST | MLP | LINEAR_LEARNER | KMEANS | FORECAST } ]
  -- not required for non AUTO OFF case, default is the list of all supported types
  -- required for AUTO OFF
[ PROBLEM_TYPE ( REGRESSION | BINARY_CLASSIFICATION | MULTICLASS_CLASSIFICATION ) ]
  -- not supported when AUTO OFF
[ OBJECTIVE ( 'MSE' | 'Accuracy' | 'F1' | 'F1_Macro' | 'AUC' |
             'reg:squarederror' | 'reg:squaredlogerror'| 'reg:logistic'|
             'reg:pseudohubererror' | 'reg:tweedie' | 'binary:logistic' | 'binary:hinge',
             'multi:softmax' | 'RMSE' | 'WAPE' | 'MAPE' | 'MASE' | 'AverageWeightedQuantileLoss' ) ]
  -- for AUTO ON: first 5 are valid
  -- for AUTO OFF: 6-13 are valid
  -- for FORECAST: 14-18 are valid
[ PREPROCESSORS 'string' ]
  -- required for AUTO OFF, when it has to be 'none'
  -- optional for AUTO ON
[ HYPERPARAMETERS { DEFAULT | DEFAULT EXCEPT ( Key 'value' (,...) ) } ]
  -- support XGBoost hyperparameters, except OBJECTIVE
  -- required and only allowed for AUTO OFF
  -- default NUM_ROUND is 100
  -- NUM_CLASS is required if objective is multi:softmax (only possible for AUTO OFF)
 [ SETTINGS (
   S3_BUCKET 'amzn-s3-demo-bucket',  |
    -- required
  TAGS 'string', |
    -- optional
  KMS_KEY_ID 'kms_string', |
    -- optional
  S3_GARBAGE_COLLECT on / off, |
    -- optional, default is on.
  MAX_CELLS integer, |
    -- optional, default is 1,000,000
  MAX_RUNTIME integer (, ...) |
    -- optional, default is 5400 (1.5 hours)
  HORIZON integer, |
    -- required if creating a forecast model
  FREQUENCY integer, |
    -- required if creating a forecast model
  PERCENTILES string, |
    -- optional if creating a forecast model
  MAX_BATCH_ROWS integer -- optional for BYOM remote inference
    ) ]
```

## Parameters
<a name="r_create_model_parameters"></a>

model\_name  
The name of the model. The model name in a schema must be unique.

FROM \{ *table\_name* \| ( *select\_query* ) \| *'job\_name'* \}  
The table name or the query that specifies the training data. It can be either an existing table in the system or an Amazon Redshift-compatible SELECT query enclosed in parentheses. There must be at least two columns in the query result. 

TARGET *column\_name*  
The name of the column that becomes the prediction target. The column must exist in the FROM clause. 

FUNCTION *function\_name* ( *data\_type* [, ...] )  
The name of the function to be created and the data types of the input arguments. You can optionally qualify the function name with the name of a schema in your database.

RETURNS *data\_type*  
The data type to be returned from the model's function. The returned `SUPER` data type is applicable only to BYOM with remote inference.

SAGEMAKER '*endpoint\_name*'[:'*model\_name*']  
The name of the Amazon SageMaker AI endpoint. If the endpoint name points to a multimodel endpoint, add the name of the model to use. The endpoint must be hosted in the same AWS Region as the Amazon Redshift cluster.
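As a sketch of the SAGEMAKER clause for BYOM remote inference, assuming a hypothetical endpoint `remote-xgb-endpoint` hosted in the same Region as the cluster:

```
CREATE MODEL remote_inference_model
FUNCTION remote_predict(int, int)
RETURNS decimal(8,2)
SAGEMAKER 'remote-xgb-endpoint'   -- hypothetical endpoint name
IAM_ROLE default;
```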

IAM\_ROLE \{ default \| 'arn:aws:iam::<account-id>:role/<role-name>' \}  
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE MODEL command runs. Alternatively, you can specify the ARN of an IAM role to use that role.

[ AUTO ON / OFF ]  
Turns CREATE MODEL automatic discovery of preprocessor, algorithm, and hyperparameter selection on or off. Specifying ON when creating a Forecast model indicates the use of an AutoPredictor, where Amazon Forecast applies the optimal combination of algorithms to each time series in your dataset. 

 *MODEL\_TYPE \{ XGBOOST \| MLP \| LINEAR\_LEARNER \| KMEANS \| FORECAST \}*   
(Optional) Specifies the model type. You can specify if you want to train a model of a specific model type, such as XGBoost, multilayer perceptron (MLP), KMEANS, or Linear Learner, which are all algorithms that Amazon SageMaker AI Autopilot supports. If you don't specify the parameter, then all supported model types are searched during training for the best model. You can also create a forecast model in Redshift ML to create accurate time-series forecasts.

 *PROBLEM\_TYPE ( REGRESSION \| BINARY\_CLASSIFICATION \| MULTICLASS\_CLASSIFICATION )*   
(Optional) Specifies the problem type. If you know the problem type, you can restrict Amazon Redshift to search only for the best model of that specific problem type. If you don't specify this parameter, a problem type is discovered during the training, based on your data.

OBJECTIVE ( 'MSE' \| 'Accuracy' \| 'F1' \| 'F1Macro' \| 'AUC' \| 'reg:squarederror' \| 'reg:squaredlogerror' \| 'reg:logistic' \| 'reg:pseudohubererror' \| 'reg:tweedie' \| 'binary:logistic' \| 'binary:hinge' \| 'multi:softmax' \| 'RMSE' \| 'WAPE' \| 'MAPE' \| 'MASE' \| 'AverageWeightedQuantileLoss' )  
(Optional) Specifies the name of the objective metric used to measure the predictive quality of a machine learning system. This metric is optimized during training to provide the best estimate for model parameter values from data. If you don't specify a metric explicitly, the default behavior is to use MSE for regression, F1 for binary classification, and Accuracy for multiclass classification. For more information about objectives, see [AutoMLJobObjective](https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AutoMLJobObjective.html) in the *Amazon SageMaker AI API Reference* and [Learning task parameters](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters) in the XGBoost documentation. The values RMSE, WAPE, MAPE, MASE, and AverageWeightedQuantileLoss are applicable only to Forecast models. For more information, see the [CreateAutoPredictor](https://docs.aws.amazon.com/forecast/latest/dg/API_CreateAutoPredictor.html#forecast-CreateAutoPredictor-request-OptimizationMetric) API operation.

 *PREPROCESSORS 'string'*   
(Optional) Specifies certain combinations of preprocessors to certain sets of columns. The format is a list of columnSets, and the appropriate transforms to be applied to each set of columns. Amazon Redshift applies all the transformers in a specific transformers list to all columns in the corresponding ColumnSet. For example, to apply OneHotEncoder with Imputer to columns t1 and t2, use the sample command following.  

```
CREATE MODEL customer_churn
FROM customer_data
TARGET 'Churn'
FUNCTION predict_churn
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
PROBLEM_TYPE BINARY_CLASSIFICATION
OBJECTIVE 'F1'
PREPROCESSORS '[
...
  {"ColumnSet": [
      "t1",
      "t2"
    ],
    "Transformers": [
      "OneHotEncoder",
      "Imputer"
    ]
  },
  {"ColumnSet": [
      "t3"
    ],
    "Transformers": [
      "OneHotEncoder"
    ]
  },
  {"ColumnSet": [
      "temp"
    ],
    "Transformers": [
      "Imputer",
      "NumericPassthrough"
    ]
  }
]'
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket'
)
```

HYPERPARAMETERS \{ DEFAULT \| DEFAULT EXCEPT ( key 'value' (,..) ) \}  
Specifies whether the default XGBoost parameters are used or overridden by user-specified values. The values must be enclosed in single quotes. Following are examples of parameters for XGBoost and their defaults.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_MODEL.html)

SETTINGS ( S3\_BUCKET *'amzn-s3-demo-bucket'*, \| TAGS 'string', \| KMS\_KEY\_ID *'kms\_string'*, \| S3\_GARBAGE\_COLLECT on / off, \| MAX\_CELLS integer, \| MAX\_RUNTIME integer (,...), \| HORIZON integer, \| FREQUENCY forecast\_frequency, \| PERCENTILES array of strings )  
The S3\_BUCKET clause specifies the Amazon S3 location that is used to store intermediate results.  
(Optional) The TAGS parameter is a comma-separated list of key-value pairs that you can use to tag resources created in Amazon SageMaker AI and Amazon Forecast. Tags help you organize resources and allocate costs. Values in the pair are optional, so you can create tags by using the format `key=value` or just by creating a key. For more information about tags in Amazon Redshift, see [ Tagging overview](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-tagging.html).  
(Optional) KMS\_KEY\_ID specifies whether Amazon Redshift uses server-side encryption with an AWS KMS key to protect data at rest. Data in transit is protected with Secure Sockets Layer (SSL).   
(Optional) S3\_GARBAGE\_COLLECT \{ ON \| OFF \} specifies whether Amazon Redshift performs garbage collection on the models and the resulting datasets used to train them. If set to OFF, the datasets and models remain in Amazon S3 and can be used for other purposes. If set to ON, Amazon Redshift deletes the artifacts in Amazon S3 after the training completes. The default is ON.  
(Optional) MAX\_CELLS specifies the number of cells in the training data. This value is the product of the number of records (in the training query or table) times the number of columns. The default is 1,000,000.  
(Optional) MAX\_RUNTIME specifies the maximum amount of time that training can take, in seconds. Training jobs often complete sooner, depending on dataset size. The default is 5,400 (90 minutes).  
HORIZON specifies the maximum number of predictions the forecast model can return. Once the model is trained, you can't change this integer. This parameter is required if training a forecast model.  
FREQUENCY specifies how granular in time units you want the forecasts to be. Available options are `Y | M | W | D | H | 30min | 15min | 10min | 5min | 1min`. This parameter is required if training a forecast model.  
(Optional) PERCENTILES is a comma-delimited string that specifies the forecast types used to train a predictor. Forecast types can be quantiles from 0.01 to 0.99, in increments of 0.01 or higher. You can also specify the mean forecast with `mean`. You can specify a maximum of five forecast types.
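As a sketch, a forecast model combines MODEL_TYPE FORECAST with the HORIZON, FREQUENCY, and PERCENTILES settings (the table, column, and bucket names here are hypothetical):

```
CREATE MODEL demand_forecast
FROM (SELECT item_id, order_date, qty FROM daily_demand)
TARGET qty
IAM_ROLE default
AUTO ON
MODEL_TYPE FORECAST
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  HORIZON 14,                       -- predict up to 14 periods ahead
  FREQUENCY 'D',                    -- daily forecasts
  PERCENTILES '0.25,0.50,0.75,mean' -- forecast types to train
);
```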

 *MAX\_BATCH\_ROWS integer*   
(Optional) The maximum number of rows that Amazon Redshift sends in a single batch request for a single SageMaker AI invocation. It is supported only for BYOM with remote inference. This parameter's minimum value is 1. The maximum value is `INT_MAX`, or 2,147,483,647, which is also the default. This parameter is required only when both the input and the returned data types are `SUPER`. 

# Usage notes
<a name="r_create_model_usage_notes"></a>

When using CREATE MODEL, consider the following:
+ The CREATE MODEL statement operates in an asynchronous mode and returns upon the export of training data to Amazon S3. The remaining steps of training in Amazon SageMaker AI occur in the background. While training is in progress, the corresponding inference function is visible but can't be run. You can query [STV\_ML\_MODEL\_INFO](r_STV_ML_MODEL_INFO.md) to see the state of training. 
+ By default, the training can run for up to 90 minutes in the background in Auto mode, and it can be extended. To cancel the training, run the [DROP MODEL](r_DROP_MODEL.md) command.
+ The Amazon Redshift cluster that you use to create the model and the Amazon S3 bucket that is used to stage the training data and model artifacts must be in the same AWS Region.
+ During the model training, Amazon Redshift and SageMaker AI store intermediate artifacts in the Amazon S3 bucket that you provide. By default, Amazon Redshift performs garbage collection at the end of the CREATE MODEL operation. Amazon Redshift removes those objects from Amazon S3. To retain those artifacts on Amazon S3, set the S3\_GARBAGE\_COLLECT OFF option.
+ You must use at least 500 rows in the training data provided in the FROM clause.
+ You can specify up to 256 feature (input) columns in the FROM \{ *table\_name* \| ( *select\_query* ) \} clause when using the CREATE MODEL statement.
+ For AUTO ON, the column types that you can use as the training set are SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE, BOOLEAN, CHAR, VARCHAR, DATE, TIME, TIMETZ, TIMESTAMP, and TIMESTAMPTZ. For AUTO OFF, the column types that you can use as the training set are SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE, and BOOLEAN.
+ You can't use DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, GEOMETRY, GEOGRAPHY, HLLSKETCH, SUPER, or VARBYTE as the target column type.
+ To improve model accuracy, do one of the following:
  + Add as many relevant columns in the CREATE MODEL command as possible when you specify the training data in the FROM clause.
  + Use larger values for MAX\_RUNTIME and MAX\_CELLS. Larger values for these parameters increase the cost of training a model.
+ The CREATE MODEL statement execution returns as soon as the training data is computed and exported to the Amazon S3 bucket. After that point, you can check the status of the training using the SHOW MODEL command. When a model being trained in the background fails, you can check the error using SHOW MODEL. You can't retry a failed model. Use DROP MODEL to remove a failed model and recreate a new model. For more information about SHOW MODEL, see [SHOW MODEL](r_SHOW_MODEL.md).
+ Local BYOM supports the same kinds of models that Amazon Redshift ML supports for non-BYOM cases. Amazon Redshift supports plain XGBoost (using XGBoost version 1.0 or later), KMEANS models without preprocessors, and XGBoost/MLP/Linear Learner models trained by Amazon SageMaker AI Autopilot. It supports the latter with preprocessors that Autopilot has specified that are also supported by Amazon SageMaker AI Neo.
+ If your Amazon Redshift cluster has enhanced routing enabled for your virtual private cloud (VPC), make sure to create an Amazon S3 VPC endpoint and a SageMaker AI VPC endpoint for the VPC that your cluster is in. Doing this enables the traffic to run through your VPC between these services during CREATE MODEL. For more information, see [SageMaker AI Clarify Job Amazon VPC Subnets and Security Groups](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-vpc.html#clarify-vpc-job).

# Use cases
<a name="r_create_model_use_cases"></a>

The following use cases demonstrate how to use CREATE MODEL to suit your needs.

## Simple CREATE MODEL
<a name="r_simple_create_model"></a>

The following summarizes the basic options of the CREATE MODEL syntax.

### Simple CREATE MODEL syntax
<a name="r_simple-create-model-synposis"></a>

```
CREATE MODEL model_name
FROM { table_name | ( select_query ) }
TARGET column_name
FUNCTION prediction_function_name
IAM_ROLE { default }
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  [ MAX_CELLS integer ]
)
```
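For instance, a minimal invocation of the simple form might look like the following (the table, column, function, and bucket names are hypothetical):

```
CREATE MODEL customer_churn_simple
FROM customer_activity           -- hypothetical training table
TARGET churned
FUNCTION predict_churn_simple
IAM_ROLE default
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket'
);
```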

### Simple CREATE MODEL parameters
<a name="r_simple-create-model-parameters"></a>

 *model\_name*   
The name of the model. The model name in a schema must be unique.

FROM \{ *table\_name* \| ( *select\_query* ) \}  
The table name or the query that specifies the training data. It can be either an existing table in the system or an Amazon Redshift-compatible SELECT query enclosed in parentheses. There must be at least two columns in the query result. 

TARGET *column\_name*  
The name of the column that becomes the prediction target. The column must exist in the FROM clause. 

FUNCTION *prediction\_function\_name*   
A value that specifies the name of the Amazon Redshift machine learning function to be generated by the CREATE MODEL and used to make predictions using this model. The function is created in the same schema as the model object and can be overloaded.  
Amazon Redshift machine learning supports models, such as Xtreme Gradient Boosted tree (XGBoost) models for regression and classification.

IAM\_ROLE \{ default \| 'arn:aws:iam::<account-id>:role/<role-name>' \}  
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE MODEL command runs. Alternatively, you can specify the ARN of an IAM role to use that role.

 *S3\_BUCKET 'amzn-s3-demo-bucket'*   
The name of the Amazon S3 bucket that you previously created and that is used to share training data and artifacts between Amazon Redshift and SageMaker AI. Amazon Redshift creates a subfolder in this bucket prior to unloading the training data. When training is complete, Amazon Redshift deletes the created subfolder and its contents. 

MAX\_CELLS *integer*   
The maximum number of cells to export from the FROM clause. The default is 1,000,000.   
The number of cells is the product of the number of rows in the training data (produced by the FROM clause table or query) times the number of columns. If the number of cells in the training data is more than the value specified by the MAX\_CELLS parameter, CREATE MODEL downsamples the FROM clause training data to reduce the size of the training set below MAX\_CELLS. Allowing larger training datasets can produce higher accuracy but also can mean the model takes longer to train and costs more.  
For information about costs of using Amazon Redshift, see [Costs for using Amazon Redshift ML](cost.md).  
For more information about costs associated with various cell numbers and free trial details, see [Amazon Redshift pricing](https://aws.amazon.com/redshift/pricing).

## CREATE MODEL with user guidance
<a name="r_user_guidance_create_model"></a>

Following, you can find a description of options for CREATE MODEL in addition to the options described in [Simple CREATE MODEL](#r_simple_create_model).

By default, CREATE MODEL searches for the best combination of preprocessing and model for your specific dataset. You might want additional control over your model, or you might want to introduce domain knowledge (such as the problem type or objective). In a customer churn scenario, if the outcome “customer is not active” is rare, then the F1 objective is often preferred to the accuracy objective. Because high accuracy models might predict “customer is active” all the time, this results in high accuracy but little business value. For information about the F1 objective, see [AutoMLJobObjective](https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AutoMLJobObjective.html) in the *Amazon SageMaker AI API Reference*.

CREATE MODEL then follows your suggestions on the specified aspects, such as the objective, while automatically discovering the best preprocessors and the best hyperparameters. 

### CREATE MODEL with user guidance syntax
<a name="r_user_guidance-create-model-synposis"></a>

CREATE MODEL offers more flexibility on the aspects that you can specify and the aspects that Amazon Redshift automatically discovers.

```
CREATE MODEL model_name
FROM { table_name | ( select_statement ) }
TARGET column_name
FUNCTION function_name
IAM_ROLE { default }
[ MODEL_TYPE { XGBOOST | MLP | LINEAR_LEARNER} ]
[ PROBLEM_TYPE ( REGRESSION | BINARY_CLASSIFICATION | MULTICLASS_CLASSIFICATION ) ]
[ OBJECTIVE ( 'MSE' | 'Accuracy' | 'F1' | 'F1Macro' | 'AUC') ]
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket', |
  S3_GARBAGE_COLLECT { ON | OFF }, |
  KMS_KEY_ID 'kms_key_id', |
  MAX_CELLS integer, |
  MAX_RUNTIME integer (, ...)
)
```

### CREATE MODEL with user guidance parameters
<a name="r_user_guidance-create-model-parameters"></a>

 *MODEL_TYPE { XGBOOST | MLP | LINEAR_LEARNER }*   
(Optional) Specifies the model type. You can specify if you want to train a model of a specific model type, such as XGBoost, multilayer perceptron (MLP), or Linear Learner, which are all algorithms that Amazon SageMaker AI Autopilot supports. If you don't specify the parameter, then all supported model types are searched during training for the best model.

 *PROBLEM_TYPE ( REGRESSION | BINARY_CLASSIFICATION | MULTICLASS_CLASSIFICATION )*   
(Optional) Specifies the problem type. If you know the problem type, you can restrict Amazon Redshift to only search for the best model of that specific problem type. If you don't specify this parameter, a problem type is discovered during the training, based on your data.

OBJECTIVE ( 'MSE' | 'Accuracy' | 'F1' | 'F1Macro' | 'AUC' )  
(Optional) Specifies the name of the objective metric used to measure the predictive quality of a machine learning system. This metric is optimized during training to provide the best estimate for model parameter values from data. If you don't specify a metric explicitly, the default behavior is to automatically use MSE for regression, F1 for binary classification, and Accuracy for multiclass classification. For more information about objectives, see [AutoMLJobObjective](https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AutoMLJobObjective.html) in the *Amazon SageMaker AI API Reference*.

MAX_CELLS integer   
(Optional) Specifies the number of cells in the training data. This value is the product of the number of records (in the training query or table) times the number of columns. The default is 1,000,000.

MAX_RUNTIME integer   
(Optional) Specifies the maximum amount of time to train, in seconds. Training jobs often complete sooner, depending on dataset size. The default is 5,400 (90 minutes).
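
As an illustrative sketch, the following CREATE MODEL limits training to 30 minutes; the table `taxi_trips` and column `fare_amount` are hypothetical placeholders.

```
CREATE MODEL fare_regression
FROM taxi_trips
TARGET fare_amount
FUNCTION predict_fare
IAM_ROLE default
PROBLEM_TYPE REGRESSION
OBJECTIVE 'MSE'
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  MAX_RUNTIME 1800
);
```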

S3_GARBAGE_COLLECT { ON | OFF }  
(Optional) Specifies whether Amazon Redshift performs garbage collection on the resulting datasets used to train models and the models. If set to OFF, the resulting datasets used to train models and the models remain in Amazon S3 and can be used for other purposes. If set to ON, Amazon Redshift deletes the artifacts in Amazon S3 after the training completes. The default is ON.

KMS_KEY_ID 'kms_key_id'  
(Optional) Specifies if Amazon Redshift uses server-side encryption with an AWS KMS key to protect data at rest. Data in transit is protected with Secure Sockets Layer (SSL). 

 *PREPROCESSORS 'string'*   
(Optional) Specifies certain combinations of preprocessors to certain sets of columns. The format is a list of columnSets, and the appropriate transforms to be applied to each set of columns. Amazon Redshift applies all the transformers in a specific transformers list to all columns in the corresponding ColumnSet. For example, to apply OneHotEncoder with Imputer to columns t1 and t2, use the sample command following.  

```
CREATE MODEL customer_churn
FROM customer_data
TARGET 'Churn'
FUNCTION predict_churn
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
PROBLEM_TYPE BINARY_CLASSIFICATION
OBJECTIVE 'F1'
PREPROCESSORS '[
...
{"ColumnSet": [
    "t1",
    "t2"
  ],
  "Transformers": [
    "OneHotEncoder",
    "Imputer"
  ]
},
{"ColumnSet": [
    "t3"
  ],
  "Transformers": [
    "OneHotEncoder"
  ]
},
{"ColumnSet": [
    "temp"
  ],
  "Transformers": [
    "Imputer",
    "NumericPassthrough"
  ]
}
]'
SETTINGS (
S3_BUCKET 'amzn-s3-demo-bucket'
)
```

Amazon Redshift supports the following transformers:
+ OneHotEncoder – Typically used to encode a discrete value into a binary vector with one nonzero value. This transformer is suitable for many machine learning models. 
+ OrdinalEncoder – Encodes discrete values into a single integer. This transformer is suitable for certain machine learning models, such as MLP and Linear Learner. 
+ NumericPassthrough – Passes input as is into the model.
+ Imputer – Fills in missing values and not a number (NaN) values.
+ ImputerWithIndicator – Fills in missing values and NaN values. This transformer also creates an indicator of whether any values were missing and filled in.
+ Normalizer – Normalizes values, which can improve the performance of many machine learning algorithms.
+ DateTimeVectorizer – Creates a vector embedding, representing a column of datetime data type that can be used in machine learning models.
+ PCA – Projects the data into a lower dimensional space to reduce the number of features while keeping as much information as possible.
+ StandardScaler – Standardizes features by removing the mean and scaling to unit variance. 
+ MinMax – Transforms features by scaling each feature to a given range.

Amazon Redshift ML stores the trained transformers, and automatically applies them as part of the prediction query. You don't need to specify them when generating predictions from your model. 
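
For example, a prediction query calls the model's function with the raw input columns, and the stored transformers run automatically. The following sketch assumes the `customer_churn` model from the preceding example, and assumes that `t1`, `t2`, `t3`, and `temp` are the model's only input columns; `customer_id` is a hypothetical identifier column.

```
SELECT customer_id,
       predict_churn(t1, t2, t3, temp) AS predicted_churn
FROM customer_data;
```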

## CREATE XGBoost models with AUTO OFF
<a name="r_auto_off_create_model"></a>

CREATE MODEL with AUTO OFF generally has different objectives from the default CREATE MODEL.

As an advanced user who already knows the model type that you want and hyperparameters to use when training these models, you can use CREATE MODEL with AUTO OFF to turn off the CREATE MODEL automatic discovery of preprocessors and hyperparameters. To do so, you explicitly specify the model type. XGBoost is currently the only model type supported when AUTO is set to OFF. You can specify hyperparameters. Amazon Redshift uses default values for any hyperparameters that you didn't specify. 

### CREATE XGBoost models with AUTO OFF syntax
<a name="r_auto_off-create-model-synposis"></a>

```
CREATE MODEL model_name
FROM { table_name | (select_statement ) }
TARGET column_name
FUNCTION function_name
IAM_ROLE { default }
AUTO OFF
MODEL_TYPE XGBOOST
OBJECTIVE { 'reg:squarederror' | 'reg:squaredlogerror' | 'reg:logistic' |
            'reg:pseudohubererror' | 'reg:tweedie' | 'binary:logistic' | 'binary:hinge' |
            'multi:softmax' | 'rank:pairwise' | 'rank:ndcg' }
HYPERPARAMETERS DEFAULT EXCEPT (
    NUM_ROUND '10',
    ETA '0.2',
    NUM_CLASS '10',
    (, ...)
)
PREPROCESSORS 'none'
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket', |
  S3_GARBAGE_COLLECT { ON | OFF }, |
  KMS_KEY_ID 'kms_key_id', |
  MAX_CELLS integer, |
  MAX_RUNTIME integer (, ...)
)
```

### CREATE XGBoost models with AUTO OFF parameters
<a name="r_auto_off-create-model-parameters"></a>

 *AUTO OFF*   
Turns off the CREATE MODEL automatic selection of preprocessors, algorithms, and hyperparameters.

MODEL_TYPE XGBOOST  
Specifies to use XGBOOST to train the model. 

OBJECTIVE str  
Specifies an objective recognized by the algorithm. Amazon Redshift supports reg:squarederror, reg:squaredlogerror, reg:logistic, reg:pseudohubererror, reg:tweedie, binary:logistic, binary:hinge, multi:softmax, rank:pairwise, and rank:ndcg. For more information about these objectives, see [Learning task parameters](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters) in the XGBoost documentation.

HYPERPARAMETERS { DEFAULT | DEFAULT EXCEPT ( key 'value' (,..) ) }  
Specifies whether the default XGBoost parameters are used or overridden by user-specified values. The values must be enclosed with single quotes. Following are examples of parameters for XGBoost and their defaults.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_create_model_use_cases.html)

The following example prepares data for XGBoost.

```
DROP TABLE IF EXISTS abalone_xgb;

CREATE TABLE abalone_xgb (
length_val float,
diameter float,
height float,
whole_weight float,
shucked_weight float,
viscera_weight float,
shell_weight float,
rings int,
record_number int);

COPY abalone_xgb
FROM 's3://redshift-downloads/redshift-ml/abalone_xg/'
REGION 'us-east-1'
IAM_ROLE default
IGNOREHEADER 1 CSV;
```

The following example creates an XGBoost model with specified advanced options, such as MODEL_TYPE, OBJECTIVE, and PREPROCESSORS.

```
DROP MODEL abalone_xgboost_multi_predict_age;

CREATE MODEL abalone_xgboost_multi_predict_age
FROM ( SELECT length_val,
              diameter,
              height,
              whole_weight,
              shucked_weight,
              viscera_weight,
              shell_weight,
              rings
   FROM abalone_xgb WHERE record_number < 2500 )
TARGET rings FUNCTION ml_fn_abalone_xgboost_multi_predict_age
IAM_ROLE default
AUTO OFF
MODEL_TYPE XGBOOST
OBJECTIVE 'multi:softmax'
PREPROCESSORS 'none'
HYPERPARAMETERS DEFAULT EXCEPT (NUM_ROUND '100', NUM_CLASS '30')
SETTINGS (S3_BUCKET 'amzn-s3-demo-bucket');
```

The following example uses an inference query to predict the age of fish with a record number greater than 2500. It uses the function ml_fn_abalone_xgboost_multi_predict_age created by the preceding command. 

```
select ml_fn_abalone_xgboost_multi_predict_age(length_val,
                                                   diameter,
                                                   height,
                                                   whole_weight,
                                                   shucked_weight,
                                                   viscera_weight,
                                                   shell_weight)+1.5 as age
from abalone_xgb where record_number > 2500;
```

## Bring your own model (BYOM) - local inference
<a name="r_byom_create_model"></a>

Amazon Redshift ML supports using bring your own model (BYOM) for local inference.

The following summarizes the options for the CREATE MODEL syntax for BYOM. You can use a model trained outside of Amazon Redshift with Amazon SageMaker AI for in-database inference locally in Amazon Redshift.

### CREATE MODEL syntax for local inference
<a name="r_local-create-model"></a>

The following describes the CREATE MODEL syntax for local inference.

```
CREATE MODEL model_name
FROM ('job_name' | 's3_path' )
FUNCTION function_name ( data_type [, ...] )
RETURNS data_type
IAM_ROLE { default }
[ SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket', | --required
  KMS_KEY_ID 'kms_string') --optional
];
```

Amazon Redshift currently supports only pretrained XGBoost, MLP, and Linear Learner models for BYOM. You can import SageMaker AI Autopilot models and models trained directly in Amazon SageMaker AI for local inference using this path. 

#### CREATE MODEL parameters for local inference
<a name="r_local-create-model-parameters"></a>

 *model_name*   
The name of the model. The model name in a schema must be unique.

FROM ( *'job_name'* | *'s3_path'* )  
The *job_name* uses an Amazon SageMaker AI job name as the input. The job name can either be an Amazon SageMaker AI training job name or an Amazon SageMaker AI Autopilot job name. The job must be created in the same AWS account that owns the Amazon Redshift cluster. To find the job name, launch Amazon SageMaker AI. In the **Training** dropdown menu, choose **Training jobs**.  
The *'s3_path'* specifies the S3 location of the .tar.gz model artifacts file that is to be used when creating the model.

FUNCTION *function_name* ( *data_type* [, ...] )  
The name of the function to be created and the data types of the input arguments. You can provide a schema name.

RETURNS *data_type*  
The data type of the value returned by the function.

IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }  
 Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE MODEL command runs.  
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. 

SETTINGS ( S3_BUCKET *'amzn-s3-demo-bucket'*, | KMS_KEY_ID *'kms_string'* )  
The S3_BUCKET clause specifies the Amazon S3 location that is used to store intermediate results.  
(Optional) The KMS_KEY_ID clause specifies if Amazon Redshift uses server-side encryption with an AWS KMS key to protect data at rest. Data in transit is protected with Secure Sockets Layer (SSL).  
For more information, see [CREATE MODEL with user guidance](#r_user_guidance_create_model).

#### CREATE MODEL for local inference example
<a name="r_local-create-model-example"></a>

The following example creates a model that has been previously trained in Amazon SageMaker AI, outside of Amazon Redshift. Because the model type is supported by Amazon Redshift ML for local inference, the following CREATE MODEL creates a function that can be used locally in Amazon Redshift. You can provide a SageMaker AI training job name.

```
CREATE MODEL customer_churn
FROM 'training-job-customer-churn-v4'
FUNCTION customer_churn_predict (varchar, int, float, float)
RETURNS int
IAM_ROLE default
SETTINGS (S3_BUCKET 'amzn-s3-demo-bucket');
```

After the model is created, you can use the function *customer_churn_predict* with the specified argument types to make predictions.
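
For example, a prediction query might look like the following sketch. The column names (`state`, `account_length`, `total_charge`, `cust_serv_calls`) and the table `customer_activity` are hypothetical; the arguments must match the varchar, int, float, float signature declared in the CREATE MODEL statement.

```
SELECT customer_id,
       customer_churn_predict(state, account_length, total_charge, cust_serv_calls) AS churn
FROM customer_activity;
```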

## Bring your own model (BYOM) - remote inference
<a name="r_byom_create_model_remote"></a>

Amazon Redshift ML also supports using bring your own model (BYOM) for remote inference.

The following summarizes the options for the CREATE MODEL syntax for BYOM.

### CREATE MODEL syntax for remote inference
<a name="r_remote-create-model"></a>

The following describes the CREATE MODEL syntax for remote inference.

```
CREATE MODEL model_name 
FUNCTION function_name ( data_type [, ...] )
RETURNS data_type
SAGEMAKER 'endpoint_name'[:'model_name']
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
[SETTINGS (MAX_BATCH_ROWS integer)];
```

#### CREATE MODEL parameters for remote inference
<a name="r_remote-create-model-parameters"></a>

 *model_name*   
The name of the model. The model name in a schema must be unique.

FUNCTION *fn_name* ( [*data_type*] [, ...] )  
The name of the function and the data types of the input arguments. See [Data types](https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html) for all of the supported data types. `Geography`, `geometry`, and `hllsketch` aren't supported.   
You can also provide a function name inside a schema using two-part notation, such as `myschema.myfunction`.

RETURNS *data_type*  
The data type of the value returned by the function. See [Data types](https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html) for all of the supported data types. `Geography`, `geometry`, and `hllsketch` aren't supported. 

SAGEMAKER *'endpoint_name'*[:*'model_name'*]   
The name of the Amazon SageMaker AI endpoint. If the endpoint name points to a multimodel endpoint, add the name of the model to use. The endpoint must be hosted in the same AWS Region and AWS account as the Amazon Redshift cluster. To find your endpoint, launch Amazon SageMaker AI. In the **Inference** dropdown menu, choose **Endpoints**.

IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }  
 Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE MODEL command runs. Alternatively, you can specify the ARN of an IAM role to use that role.

MAX_BATCH_ROWS *integer*  
The maximum number of rows that Amazon Redshift sends in a single batch request for a single SageMaker AI invocation. It is supported only for BYOM with remote inference. The actual number of rows in a batch also depends on the input size, but is less than or equal to this value. This parameter's minimum value is 1. The maximum value is `INT_MAX`, or 2,147,483,647. This parameter is required only when both input and returned data types are `SUPER`. The default value is `INT_MAX`, or 2,147,483,647. 

When the model is deployed to a SageMaker AI endpoint, the model information is created in Amazon Redshift, and inference is then performed through the external function. You can use the SHOW MODEL command to view the model information on your Amazon Redshift cluster.
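
For example, the following sketch inspects a model's stored metadata; `customer_churn` here stands in for any model name on your cluster.

```
SHOW MODEL customer_churn;
```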

#### CREATE MODEL for remote inference usage notes
<a name="r_remote-create-model-usage-notes"></a>

Before using CREATE MODEL for remote inference, consider the following:
+ The endpoint must be hosted by the same AWS account that owns the Amazon Redshift cluster.
+ Make sure either that the Amazon SageMaker AI endpoint has enough resources to accommodate inference calls from Amazon Redshift or that the Amazon SageMaker AI endpoint can be automatically scaled.
+ If you're not using the `SUPER` data type as input, the model only accepts inputs in the format of comma-separated values (CSV) which corresponds to a content type of `text/CSV` in SageMaker AI.
+ If you're not using the `SUPER` data type as input, the output of models is a single value of the type specified when you create the function. The output is in the format of comma-separated values (CSV) through a content type of `text/CSV` in SageMaker AI. `VARCHAR` values can't be quoted and can't contain newlines, and each output must be on a new line.
+ Models accept nulls as empty strings.
+ When the input data type is `SUPER`, only one input argument is supported. 
+ When the input data type is `SUPER`, the returned data type must also be `SUPER`. 
+ MAX_BATCH_ROWS is required when both input and returned data types are `SUPER`. 
+ When the input data type is `SUPER`, the content type of the endpoint invocation is `application/json` when MAX_BATCH_ROWS is `1`, or `application/jsonlines` in all other cases. 
+ When the returned data type is `SUPER`, the accept type of the endpoint invocation is `application/json` when MAX_BATCH_ROWS is `1`, or `application/jsonlines` in all other cases. 

##### CREATE MODEL for remote inference example
<a name="r_remote-create-model-example"></a>

The following example creates a model that uses a SageMaker AI endpoint to make predictions. Make sure that the endpoint is running to make predictions and specify its name in the CREATE MODEL command.

```
CREATE MODEL remote_customer_churn
FUNCTION remote_fn_customer_churn_predict (varchar, int, float, float)
RETURNS int
SAGEMAKER 'customer-churn-endpoint'
IAM_ROLE default;
```

 The following example creates a BYOM with remote inference for a large language model (LLM). LLMs hosted on Amazon SageMaker AI JumpStart accept and return the `application/json` content type and support a single JSON document per invocation. The input and returned data types must be `SUPER`, and MAX_BATCH_ROWS must be set to 1. 

```
CREATE MODEL sample_super_data_model
FUNCTION sample_super_data_model_predict(super)
RETURNS super
SAGEMAKER 'sample_super_data_model_endpoint'
IAM_ROLE default
SETTINGS (MAX_BATCH_ROWS 1);
```
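
A query against such a model passes a single SUPER value per row. The following sketch assumes a hypothetical staging table `prompt_staging` with a SUPER column `input_payload` (for example, built with JSON_PARSE); the exact JSON shape the endpoint accepts depends on the hosted model.

```
SELECT sample_super_data_model_predict(input_payload) AS model_response
FROM prompt_staging;
```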

## CREATE MODEL with K-MEANS
<a name="r_k-means_create_model"></a>

Amazon Redshift supports the K-Means algorithm that groups data that isn't labeled. This algorithm solves clustering problems where you want to discover groupings in the data. Unclassified data is grouped and partitioned based on its similarities and differences. 

### CREATE MODEL with K-MEANS syntax
<a name="r_k-means-create-model-synposis"></a>

```
CREATE MODEL model_name
FROM { table_name | ( select_statement ) }
FUNCTION function_name
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }
AUTO OFF
MODEL_TYPE KMEANS
PREPROCESSORS 'string'
HYPERPARAMETERS DEFAULT EXCEPT ( K 'val' [, ...] )
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  KMS_KEY_ID 'kms_string', |
    -- optional
  S3_GARBAGE_COLLECT on / off, |
    -- optional
  MAX_CELLS integer, |
    -- optional
  MAX_RUNTIME integer
    -- optional);
```

### CREATE MODEL with K-MEANS parameters
<a name="r_k-means-create-model-parameters"></a>

 *AUTO OFF*   
Turns off the CREATE MODEL automatic selection of preprocessors, algorithms, and hyperparameters.

MODEL_TYPE KMEANS  
Specifies to use KMEANS to train the model. 

PREPROCESSORS 'string'  
Specifies certain combinations of preprocessors to certain sets of columns. The format is a list of columnSets, and the appropriate transforms to be applied to each set of columns. Amazon Redshift supports three K-Means preprocessors: StandardScaler, MinMax, and NumericPassthrough. If you don't want to apply any preprocessing for K-Means, choose NumericPassthrough explicitly as a transformer. For more information about supported transformers, see [CREATE MODEL with user guidance parameters](#r_user_guidance-create-model-parameters).  
The K-Means algorithm uses Euclidean distance to calculate similarity. Preprocessing the data ensures that the features of the model stay on the same scale and produce reliable results.

HYPERPARAMETERS DEFAULT EXCEPT ( K 'val' [, ...] )  
Specifies whether the K-Means parameters are used. You must specify the `K` parameter when using the K-Means algorithm. For more information, see [K-Means Hyperparameters](https://docs.aws.amazon.com/sagemaker/latest/dg/k-means-api-config.html) in the *Amazon SageMaker AI Developer Guide*.

The following example creates a K-Means model that groups customers into five clusters and then queries the resulting function.

```
CREATE MODEL customers_clusters
FROM customers
FUNCTION customers_cluster
IAM_ROLE default
AUTO OFF
MODEL_TYPE KMEANS
PREPROCESSORS '[
{
  "ColumnSet": [ "*" ],
  "Transformers": [ "NumericPassthrough" ]
}
]'
HYPERPARAMETERS DEFAULT EXCEPT ( K '5' )
SETTINGS (S3_BUCKET 'amzn-s3-demo-bucket');

select customer_id, customers_cluster(...) from customers;
customer_id | customers_cluster
--------------------
12345            1
12346            2
12347            4
12348
```

## CREATE MODEL with Forecast
<a name="r_forecast_model"></a>

Forecast models in Redshift ML use Amazon Forecast to create accurate time-series forecasts. Doing so lets you use historical data over a time period to make predictions about future events. Common use cases of Amazon Forecast include using retail product data to decide how to price inventory, manufacturing quantity data to predict how much of one item to order, and web traffic data to forecast how much traffic a web server might receive. 

 [Quota limits from Amazon Forecast](https://docs.aws.amazon.com/forecast/latest/dg/limits.html) are enforced in Amazon Redshift forecast models. For example, the maximum number of forecasts is 100, but it's adjustable. Dropping a forecast model doesn’t automatically delete the associated resources in Amazon Forecast. If you delete a Redshift cluster, all associated models are dropped as well. 

Note that Forecast models are currently only available in the following Regions:
+ US East (Ohio) (us-east-2)
+ US East (N. Virginia) (us-east-1)
+ US West (Oregon) (us-west-2)
+ Asia Pacific (Mumbai) (ap-south-1)
+ Asia Pacific (Seoul) (ap-northeast-2)
+ Asia Pacific (Singapore) (ap-southeast-1)
+ Asia Pacific (Sydney) (ap-southeast-2)
+ Asia Pacific (Tokyo) (ap-northeast-1)
+ Europe (Frankfurt) (eu-central-1)
+ Europe (Ireland) (eu-west-1)

### CREATE MODEL with Forecast syntax
<a name="r_forecast_model-synopsis"></a>

```
CREATE [ OR REPLACE ] MODEL forecast_model_name 
FROM { table_name | ( select_query ) } 
TARGET column_name
IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>'} 
AUTO ON
MODEL_TYPE FORECAST
SETTINGS (
  S3_BUCKET 'amzn-s3-demo-bucket',
  HORIZON integer,
  FREQUENCY forecast_frequency
  [PERCENTILES '0.1', '0.5', '0.9']
  )
```

### CREATE MODEL with Forecast parameters
<a name="r_forecast_model-parameters"></a>

 *forecast_model_name*   
The name of the model. The model name must be unique.

FROM { table_name | ( select_query ) }  
The table_name or the query that specifies the training data. This can either be an existing table in the system, or an Amazon Redshift compatible SELECT query enclosed in parentheses. The table or query result must have at least three columns: (1) a varchar column that specifies the name of the time series (each dataset can have multiple time series); (2) a datetime column; and (3) the target column to predict. This target column must be either an int or a float. If you supply a dataset that has more than three columns, Amazon Redshift assumes that all additional columns are part of a related time series. Note that related time series must be of type int or float. For more information about related time series, see [Using Related Time Series Datasets](https://docs.aws.amazon.com/forecast/latest/dg/related-time-series-datasets.html).

TARGET column_name  
The name of the column that becomes the prediction target. The column must exist in the FROM clause.

IAM_ROLE { default | 'arn:aws:iam::<account-id>:role/<role-name>' }  
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the CREATE MODEL command runs. Alternatively, you can specify the ARN of an IAM role to use that role. 

AUTO ON  
Turns on the CREATE MODEL automatic selection of algorithms and hyperparameters. Specifying AUTO ON when creating a Forecast model indicates the use of a Forecast AutoPredictor, where Amazon Forecast applies the optimal combination of algorithms to each time series in your dataset.

MODEL_TYPE FORECAST  
Specifies to use FORECAST to train the model.

S3_BUCKET 'amzn-s3-demo-bucket'  
The name of the Amazon Simple Storage Service bucket that you previously created and that’s used to share training data and artifacts between Amazon Redshift and Amazon Forecast. Amazon Redshift creates a subfolder in this bucket before unloading the training data. When training is complete, Amazon Redshift deletes the created subfolder and its contents.

HORIZON integer  
The maximum number of predictions that the forecast model can return. After the model is trained, you can't change this value.

FREQUENCY forecast_frequency  
Specifies how granular you want the forecasts to be. Available options are `Y | M | W | D | H | 30min | 15min | 10min | 5min | 1min`. Required if you’re training a forecast model.

PERCENTILES string  
A comma-delimited string that specifies the forecast types used to train a predictor. Forecast types can be quantiles from 0.01 to 0.99, by increments of 0.01 or higher. You can also specify the mean forecast with mean. You can specify a maximum of five forecast types.

The following example demonstrates how to create a simple forecast model.

```
CREATE MODEL forecast_example
FROM forecast_electricity_
TARGET target 
IAM_ROLE 'arn:aws:iam::<account-id>:role/<role-name>'
AUTO ON 
MODEL_TYPE FORECAST
SETTINGS (S3_BUCKET 'amzn-s3-demo-bucket',
          HORIZON 24,
          FREQUENCY 'H',
          PERCENTILES '0.25,0.50,0.75,mean',
          S3_GARBAGE_COLLECT OFF);
```

After you create the forecast model, you can create a new table with the prediction data.

```
CREATE TABLE forecast_model_results as SELECT Forecast(forecast_example)
```

You can then query the new table to get predictions.

```
SELECT * FROM forecast_model_results
```
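
For example, you can filter the results down to a single time series, as in the following sketch. It assumes the output includes `id` and `time` columns identifying each series and timestamp; the actual column names and values depend on your model's settings, and `client_1` is a hypothetical series name.

```
SELECT *
FROM forecast_model_results
WHERE id = 'client_1'
ORDER BY time;
```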

# CREATE PROCEDURE
<a name="r_CREATE_PROCEDURE"></a>

Creates a new stored procedure or replaces an existing procedure for the current database.

For more information and examples, see [Creating stored procedures in Amazon Redshift](stored-procedure-overview.md).

## Required privileges
<a name="r_CREATE_PROCEDURE-privileges"></a>

You must have permission in one of the following ways to run CREATE OR REPLACE PROCEDURE:
+ For CREATE PROCEDURE:
  + Superuser
  + Users with CREATE and USAGE privilege on the schema where the stored procedure is created
+ For REPLACE PROCEDURE:
  + Superuser
  + Procedure owner

## Syntax
<a name="r_CREATE_PROCEDURE-synopsis"></a>

```
CREATE [ OR REPLACE ] PROCEDURE sp_procedure_name  
  ( [ [ argname ] [ argmode ] argtype [, ...] ] )
[ NONATOMIC ]
AS $$
  procedure_body
$$ LANGUAGE plpgsql
[ { SECURITY INVOKER | SECURITY DEFINER } ]
[ SET configuration_parameter { TO value | = value } ]
```

## Parameters
<a name="r_CREATE_PROCEDURE-parameters"></a>

 OR REPLACE   
A clause that specifies that if a procedure with the same name and input argument data types, or signature, as this one already exists, the existing procedure is replaced. You can only replace a procedure with a new procedure that defines an identical set of data types.   
If you define a procedure with the same name as an existing procedure, but a different signature, you create a new procedure. In other words, the procedure name is overloaded. For more information, see [Overloading procedure names](stored-procedure-naming.md#stored-procedure-overloading-name). 

 *sp_procedure_name*   
The name of the procedure. If you specify a schema name (such as **myschema.myprocedure**), the procedure is created in the specified schema. Otherwise, the procedure is created in the current schema. For more information about valid names, see [Names and identifiers](r_names.md).   
We recommend that you prefix all stored procedure names with `sp_`. Amazon Redshift reserves the `sp_` prefix for stored procedure names. By using the `sp_` prefix, you ensure that your stored procedure name doesn't conflict with any existing or future Amazon Redshift built-in stored procedure or function names. For more information, see [Naming stored procedures](stored-procedure-naming.md).  
You can define more than one procedure with the same name if the data types for the input arguments, or signatures, are different. In other words, in this case the procedure name is overloaded. For more information, see [Overloading procedure names](stored-procedure-naming.md#stored-procedure-overloading-name)

*[argname] [ argmode] argtype*   
A list of argument names, argument modes, and data types. Only the data type is required. Name and mode are optional and their position can be switched.  
The argument mode can be IN, OUT, or INOUT. The default is IN.  
You can use OUT and INOUT arguments to return one or more values from a procedure call. When there are OUT or INOUT arguments, the procedure call returns one result row containing *n* columns, where *n* is the total number of OUT or INOUT arguments.  
INOUT arguments are input and output arguments at the same time. *Input arguments* include both IN and INOUT arguments, and *output arguments* include both OUT and INOUT arguments.  
OUT arguments aren't specified as part of the CALL statement. Specify INOUT arguments in the stored procedure CALL statement. INOUT arguments can be useful when passing and returning values from a nested call, and also when returning a `refcursor`. For more information on `refcursor` types, see [Cursors](c_PLpgSQL-statements.md#r_PLpgSQL-cursors).  
The argument data types can be any standard Amazon Redshift data type. In addition, an argument data type can be `refcursor`.  
You can specify a maximum of 32 input arguments and 32 output arguments. 
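
For illustration, the following sketch shows a signature that combines the three argument modes (the procedure and argument names are hypothetical):

```
CREATE OR REPLACE PROCEDURE sp_example_args(
  in_id IN int,            -- IN: input only (the default mode)
  msg INOUT varchar(100),  -- INOUT: passed in the CALL statement and returned
  total OUT int)           -- OUT: not passed in the CALL statement, only returned
AS $$
BEGIN
  total := in_id * 2;
  msg := msg || ' processed';
END;
$$ LANGUAGE plpgsql;
```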

AS $$ *procedure_body* $$   
A construct that encloses the procedure to be run. The literal keywords AS $$ and $$ are required.  
Amazon Redshift requires you to enclose the statement in your procedure by using a format called dollar quoting. Anything within the enclosure is passed exactly as is. You don't need to escape any special characters because the contents of the string are written literally.  
With *dollar quoting*, you use a pair of dollar signs ($$) to signify the start and the end of the statement to run, as shown in the following example.  

```
$$ my statement $$
```
Optionally, between the dollar signs in each pair, you can specify a string to help identify the statement. The string that you use must be the same in both the start and the end of the enclosure pairs. This string is case-sensitive, and it follows the same constraints as an unquoted identifier except that it can't contain dollar signs. The following example uses the string `test`.  

```
$test$ my statement $test$
```
This syntax is also useful for nested dollar quoting. For more information about dollar quoting, see "Dollar-quoted String Constants" under [Lexical Structure](https://www.postgresql.org/docs/9.0/sql-syntax-lexical.html) in the PostgreSQL documentation.
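
As an illustrative sketch of nested dollar quoting, a tagged outer pair lets an untagged `$$` pair appear inside the body without ending it (the tag `proc` is arbitrary):

```
$proc$
BEGIN
  EXECUTE $$ DROP TABLE IF EXISTS tmp_tbl $$;
END;
$proc$
```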

 *procedure_body*   
A set of valid PL/pgSQL statements. PL/pgSQL statements augment SQL commands with procedural constructs, including looping and conditional expressions, to control logical flow. Most SQL commands can be used in the procedure body, including data manipulation language (DML) such as COPY, UNLOAD, and INSERT, and data definition language (DDL) such as CREATE TABLE. For more information, see [PL/pgSQL language reference](c_pl_pgSQL_reference.md). 

LANGUAGE *plpgsql*  
A language value. Specify `plpgsql`. You must have USAGE permission on the language to use `plpgsql`. For more information, see [GRANT](r_GRANT.md). 

NONATOMIC  
Creates the stored procedure in a nonatomic transaction mode. NONATOMIC mode automatically commits the statements inside the procedure. Additionally, when an error occurs inside the NONATOMIC procedure, the error is not re-thrown if it is handled by an exception block. For more information, see [Managing transactions](stored-procedure-transaction-management.md) and [RAISE](c_PLpgSQL-statements.md#r_PLpgSQL-messages-errors).  
When you define a stored procedure as `NONATOMIC`, consider the following:  
+ When you nest stored procedure calls, all the procedures must be created in the same transaction mode.
+ The `SECURITY DEFINER` option and `SET configuration_parameter` option are not supported when creating a procedure in NONATOMIC mode.
+ Any cursor that is opened (explicitly or implicitly) is closed automatically when an implicit commit is processed. Therefore, you must open an explicit transaction before beginning a cursor loop to ensure that any SQL within the loop's iteration is not implicitly committed.

SECURITY INVOKER | SECURITY DEFINER  
The `SECURITY DEFINER` option is not supported when `NONATOMIC` is specified.  
The security mode for the procedure determines the procedure's access privileges at runtime. The procedure must have permission to access the underlying database objects.   
For SECURITY INVOKER mode, the procedure uses the privileges of the user calling the procedure. The user must have explicit permissions on the underlying database objects. The default is SECURITY INVOKER.  
For SECURITY DEFINER mode, the procedure uses the privileges of the procedure owner. The procedure owner is defined as the user that owns the procedure at run time, not necessarily the user that initially defined the procedure. The user calling the procedure needs execute privilege on the procedure, but doesn't need any privileges on the underlying objects. 

SET configuration_parameter { TO value | = value }  
These options are not supported when `NONATOMIC` is specified.  
The SET clause causes the specified `configuration_parameter` to be set to the specified value when the procedure is entered. This clause then restores `configuration_parameter` to its earlier value when the procedure exits. 
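
As a sketch, the following procedure (hypothetical name; `statement_timeout` is used only as an example configuration parameter) applies a setting for the duration of the call and restores the previous value on exit:

```
CREATE OR REPLACE PROCEDURE sp_limited_runtime()
AS $$
BEGIN
  RAISE INFO 'statements in this call run with the session-local setting';
END;
$$ LANGUAGE plpgsql
SET statement_timeout TO 300000;
```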

## Usage notes
<a name="r_CREATE_PROCEDURE-usage"></a>

If a stored procedure was created using the SECURITY DEFINER option, when invoking the CURRENT_USER function from within the stored procedure, Amazon Redshift returns the user name of the owner of the stored procedure.

## Examples
<a name="r_CREATE_PROCEDURE-examples"></a>

**Note**  
If you encounter an error similar to the following when running these examples:  

```
ERROR: 42601: [Amazon](500310) unterminated dollar-quoted string at or near "$$
```
See [Overview of stored procedures in Amazon Redshift](stored-procedure-create.md). 

The following example creates a procedure with two input parameters.

```
CREATE OR REPLACE PROCEDURE test_sp1(f1 int, f2 varchar(20))
AS $$
DECLARE
  min_val int;
BEGIN
  DROP TABLE IF EXISTS tmp_tbl;
  CREATE TEMP TABLE tmp_tbl(id int);
  INSERT INTO tmp_tbl values (f1),(10001),(10002);
  SELECT INTO min_val MIN(id) FROM tmp_tbl;
  RAISE INFO 'min_val = %, f2 = %', min_val, f2;
END;
$$ LANGUAGE plpgsql;
```

**Note**  
 When you write stored procedures, we recommend a best practice for securing sensitive values:   
 Don't hard-code any sensitive information in stored procedure logic. For example, don't assign a user password in a CREATE USER statement in the body of a stored procedure. This poses a security risk, because hard-coded values can be recorded as schema metadata in catalog tables. Instead, pass sensitive values, such as passwords, to the stored procedure as arguments.   
For more information about stored procedures, see [CREATE PROCEDURE](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_PROCEDURE.html) and [Creating stored procedures in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/dg/stored-procedure-overview.html). For more information about catalog tables, see [System catalog tables](https://docs.aws.amazon.com/redshift/latest/dg/c_intro_catalog_views.html).

The following example creates a procedure with one IN parameter, one OUT parameter, and one INOUT parameter.

```
CREATE OR REPLACE PROCEDURE test_sp2(f1 IN int, f2 INOUT varchar(256), out_var OUT varchar(256))
AS $$
DECLARE
  loop_var int;
BEGIN
  IF f1 is null OR f2 is null THEN
    RAISE EXCEPTION 'input cannot be null';
  END IF;
  DROP TABLE if exists my_etl;
  CREATE TEMP TABLE my_etl(a int, b varchar);
    FOR loop_var IN 1..f1 LOOP
        insert into my_etl values (loop_var, f2);
        f2 := f2 || '+' || f2;
    END LOOP;
  SELECT INTO out_var count(*) from my_etl;
END;
$$ LANGUAGE plpgsql;
```

The following example creates a procedure that uses the `SECURITY DEFINER` parameter. This procedure runs using the privileges of the user who owns the procedure.

```
CREATE OR REPLACE PROCEDURE sp_get_current_user_definer()
AS $$
DECLARE curr_user varchar(250);
BEGIN
  SELECT current_user INTO curr_user;
  RAISE INFO '%', curr_user;
END;
$$ LANGUAGE plpgsql
SECURITY DEFINER;
```

The following example creates a procedure that uses the `SECURITY INVOKER` parameter. This procedure runs using the privileges of the user who runs the procedure.

```
CREATE OR REPLACE PROCEDURE sp_get_current_user_invoker()
AS $$
DECLARE curr_user varchar(250);
BEGIN
  SELECT current_user INTO curr_user;
  RAISE INFO '%', curr_user;
END;
$$ LANGUAGE plpgsql
SECURITY INVOKER;
```
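
Either procedure can then be run with the CALL statement; for example:

```
CALL sp_get_current_user_invoker();
```

With the SECURITY DEFINER version, the same kind of call reports the owner's user name rather than the caller's.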

# CREATE RLS POLICY
<a name="r_CREATE_RLS_POLICY"></a>

Creates a new row-level security policy to provide granular access to database objects.

Superusers and users or roles that have the sys:secadmin role can create a policy.

## Syntax
<a name="r_CREATE_RLS_POLICY-synopsis"></a>

```
CREATE RLS POLICY { policy_name | database_name.policy_name }
[ WITH (column_name data_type [, ...]) [ [AS] relation_alias ] ]
USING ( using_predicate_exp )
```

## Parameters
<a name="r_CREATE_RLS_POLICY-parameters"></a>

 *policy_name*   
The name of the policy.

*database_name*  
The name of the database where the policy is created. The policy can be created on the connected database or on a database that supports Amazon Redshift federated permissions.

WITH (*column_name data_type [, ...]*)   
Specifies the *column_name* and *data_type* of the columns, in the tables to which the policy is attached, that the policy references.   
You can omit the WITH clause only when the RLS policy doesn't reference any columns of the tables to which the policy is attached.

AS *relation_alias*  
Specifies an optional alias for the table that the RLS policy will be attached to.

USING ( *using_predicate_exp* )  
Specifies a filter that is applied to the WHERE clause of the query. Amazon Redshift applies the policy predicate before the query-level user predicates. For example, **current_user = 'joe' and price > 10** limits Joe to seeing only records with a price greater than $10.

For the usage of CREATE RLS POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).

## Usage notes
<a name="r_CREATE_RLS_POLICY-usage"></a>

When working with the CREATE RLS POLICY statement, observe the following:
+ Amazon Redshift supports filters that can be part of a WHERE clause of a query.
+ All policies being attached to a table must have been created with the same table alias.
+ You must use the GRANT and REVOKE statements to explicitly grant and revoke SELECT permissions to RLS policies that reference lookup tables. A lookup table is a table object used inside a policy definition. For more information, see [GRANT](r_GRANT.md) and [REVOKE](r_REVOKE.md). 
+ Amazon Redshift row-level security doesn't support the following object types inside a policy definition: catalog tables, cross-database relations, external tables, regular views, late-binding views, tables with RLS policies turned on, and temporary tables.

## Examples
<a name="r_CREATE_RLS_POLICY-examples"></a>

The following example creates an RLS policy called `policy_concerts`. This policy applies to a VARCHAR(10) column called catgroup and sets the USING filter to return only rows where the value of catgroup is `'Concerts'`.

```
CREATE RLS POLICY policy_concerts
WITH (catgroup VARCHAR(10))
USING (catgroup = 'Concerts');
```
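
A policy takes effect only after it is attached to a table and row-level security is turned on for that table. As a sketch (the table name is illustrative; see ATTACH RLS POLICY and ALTER TABLE for the full syntax):

```
ATTACH RLS POLICY policy_concerts ON category TO PUBLIC;
ALTER TABLE category ROW LEVEL SECURITY ON;
```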

For an end-to-end example of using RLS policies, see [Row-level security end-to-end example](t_rls-example.md).

# CREATE ROLE
<a name="r_CREATE_ROLE"></a>

Creates a new custom role that is a collection of permissions. For a list of Amazon Redshift system-defined roles, see [Amazon Redshift system-defined roles](r_roles-default.md). Query [SVV_ROLES](r_SVV_ROLES.md) to view the currently created roles in your cluster or workgroup.

There is a quota on the number of roles that can be created. For more information, see [Quotas and limits in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.

## Required permissions
<a name="r_CREATE_ROLE-privileges"></a>

Following are the required privileges for CREATE ROLE.
+ Superuser
+ Users with the CREATE ROLE privilege

## Syntax
<a name="r_CREATE_ROLE-synopsis"></a>

```
CREATE ROLE role_name
[ EXTERNALID external_id ]
```

## Parameters
<a name="r_CREATE_ROLE-parameters"></a>

*role_name*  
The name of the role. The role name must be unique and can't be the same as any user name. A role name can't be a reserved word.  
A superuser or regular user with the CREATE ROLE privilege can create roles. A user that is not a superuser but that has been granted USAGE to the role WITH GRANT OPTION and ALTER privilege can grant this role to anyone.

EXTERNALID *external_id*  
The identifier for the role, which is associated with an identity provider. For more information, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html).

## Examples
<a name="r_CREATE_ROLE-examples"></a>

The following example creates a role `sample_role1`.

```
CREATE ROLE sample_role1;
```

The following example creates a role `sample_role1`, with an external ID that is associated with an identity provider.

```
CREATE ROLE sample_role1 EXTERNALID "ABC123";
```
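
After creating a role, you typically grant it permissions and then grant the role to users; for example (the table and user names are illustrative):

```
GRANT SELECT ON TABLE public.sales TO ROLE sample_role1;
GRANT ROLE sample_role1 TO user1;
```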

# CREATE SCHEMA
<a name="r_CREATE_SCHEMA"></a>

Defines a new schema for the current database.

## Required privileges
<a name="r_CREATE_SCHEMA-privileges"></a>

Following are required privileges for CREATE SCHEMA:
+ Superuser
+ Users with the CREATE SCHEMA privilege

## Syntax
<a name="r_CREATE_SCHEMA-synopsis"></a>

```
CREATE SCHEMA [ IF NOT EXISTS ] schema_name [ AUTHORIZATION username ]
           [ QUOTA {quota [MB | GB | TB] | UNLIMITED} ] [ schema_element [ ... ] ]

CREATE SCHEMA AUTHORIZATION username [ QUOTA {quota [MB | GB | TB] | UNLIMITED} ] [ schema_element [ ... ] ]
```

## Parameters
<a name="r_CREATE_SCHEMA-parameters"></a>

 IF NOT EXISTS   
Clause that indicates that if the specified schema already exists, the command should make no changes and return a message that the schema exists, rather than terminating with an error.  
This clause is useful when scripting, so the script doesn’t fail if CREATE SCHEMA tries to create a schema that already exists.

 *schema_name*   
Name of the new schema. The schema name can't be `PUBLIC`. For more information about valid names, see [Names and identifiers](r_names.md).  
The list of schemas in the [search_path](r_search_path.md) configuration parameter determines the precedence of identically named objects when they are referenced without schema names.

AUTHORIZATION   
Clause that gives ownership to a specified user.

 *username*   
Name of the schema owner.

 *schema_element*   
Definition for one or more objects to be created within the schema.

QUOTA  
The maximum amount of disk space that the specified schema can use. This space is the collective disk usage. It includes all permanent tables, materialized views under the specified schema, and duplicate copies of all tables with ALL distribution on each compute node. The schema quota doesn't take into account temporary tables created as part of a temporary namespace or schema.   
To view the configured schema quotas, see [SVV_SCHEMA_QUOTA_STATE](r_SVV_SCHEMA_QUOTA_STATE.md).  
To view the records where schema quotas were exceeded, see [STL_SCHEMA_QUOTA_VIOLATIONS](r_STL_SCHEMA_QUOTA_VIOLATIONS.md).  
Amazon Redshift converts the selected value to megabytes. Gigabytes is the default unit of measurement when you don't specify a value.  
You must be a database superuser to set and change a schema quota. A user that is not a superuser but that has CREATE SCHEMA permission can create a schema with a defined quota. When you create a schema without defining a quota, the schema has an unlimited quota. When you set the quota below the current value used by the schema, Amazon Redshift doesn't allow further ingestion until you free disk space. A DELETE statement deletes data from a table and disk space is freed up only when VACUUM runs.   
Amazon Redshift checks each transaction for quota violations before committing the transaction. Amazon Redshift checks the size (the disk space used by all tables in a schema) of each modified schema against the set quota. Because the quota violation check occurs at the end of a transaction, the size limit can exceed the quota temporarily within a transaction before it's committed. When a transaction exceeds the quota, Amazon Redshift stops the transaction, prohibits subsequent ingestions, and reverts all the changes until you free disk space. Due to background VACUUM and internal cleanup, it is possible that a schema isn't full by the time that you check the schema after a canceled transaction.   
As an exception, Amazon Redshift disregards the quota violation and commits transactions in certain cases. Amazon Redshift does this for transactions that consist solely of one or more of the following statements where there isn't an INSERT or COPY ingestion statement in the same transaction:  
+ DELETE
+ TRUNCATE
+ VACUUM
+ DROP TABLE
+ ALTER TABLE APPEND only when moving data from the full schema to another non-full schema
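
To monitor usage against a quota, you can query the view mentioned earlier. The following is a sketch; verify the column names for your version against the SVV_SCHEMA_QUOTA_STATE reference:

```
SELECT schema_name, quota, disk_usage
FROM svv_schema_quota_state
WHERE schema_name = 'us_sales';
```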

 *UNLIMITED*   
Amazon Redshift imposes no limit to the growth of the total size of the schema.

## Limits
<a name="r_CREATE_SCHEMA-limit"></a>

Amazon Redshift enforces the following limits for schemas.
+ There is a maximum of 9900 schemas per database.

## Examples
<a name="r_CREATE_SCHEMA-examples"></a>

The following example creates a schema named US_SALES and gives ownership to the user DWUSER.

```
create schema us_sales authorization dwuser;
```

The following example creates a schema named US_SALES, gives ownership to the user DWUSER, and sets the quota to 50 GB.

```
create schema us_sales authorization dwuser QUOTA 50 GB;
```

To view the new schema, query the PG_NAMESPACE catalog table as shown following.

```
select nspname as schema, usename as owner
from pg_namespace, pg_user
where pg_namespace.nspowner = pg_user.usesysid
and pg_user.usename ='dwuser';

   schema |  owner
----------+----------
 us_sales | dwuser
(1 row)
```

The following example either creates the US_SALES schema, or does nothing and returns a message if it already exists.

```
create schema if not exists us_sales;
```

# CREATE TABLE
<a name="r_CREATE_TABLE_NEW"></a>

Creates a new table in the current database. You define a list of columns, which each hold data of a distinct type. The owner of the table is the issuer of the CREATE TABLE command.

## Required privileges
<a name="r_CREATE_TABLE-privileges"></a>

Following are required privileges for CREATE TABLE:
+ Superuser
+ Users with the CREATE TABLE privilege

## Syntax
<a name="r_CREATE_TABLE_NEW-synopsis"></a>

```
CREATE [ [ LOCAL ] { TEMPORARY | TEMP } ] TABLE
[ IF NOT EXISTS ] table_name
( { column_name data_type [column_attributes] [ column_constraints ]
  | table_constraints
  | LIKE parent_table [ { INCLUDING | EXCLUDING } DEFAULTS ] }
  [, ... ]  )
[ BACKUP { YES | NO } ]
[table_attributes]

where column_attributes are:
  [ DEFAULT default_expr ]
  [ IDENTITY ( seed, step ) ]
  [ GENERATED BY DEFAULT AS IDENTITY ( seed, step ) ]
  [ ENCODE encoding ]
  [ DISTKEY ]
  [ SORTKEY ]
  [ COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI } ]

and column_constraints are:
  [ { NOT NULL | NULL } ]
  [ { UNIQUE  |  PRIMARY KEY } ]
  [ REFERENCES reftable [ ( refcolumn ) ] ]

and table_constraints  are:
  [ UNIQUE ( column_name [, ... ] ) ]
  [ PRIMARY KEY ( column_name [, ... ] )  ]
  [ FOREIGN KEY ( column_name [, ... ] ) REFERENCES reftable [ ( refcolumn ) ] ]


and table_attributes are:
  [ DISTSTYLE { AUTO | EVEN | KEY | ALL } ]
  [ DISTKEY ( column_name ) ]
  [ [COMPOUND | INTERLEAVED ] SORTKEY ( column_name [,...]) |  [ SORTKEY AUTO ] ]
  [ ENCODE AUTO ]
```

## Parameters
<a name="r_CREATE_TABLE_NEW-parameters"></a>

LOCAL   
Optional. Although this keyword is accepted in the statement, it has no effect in Amazon Redshift.

TEMPORARY | TEMP   
Keyword that creates a temporary table that is visible only within the current session. The table is automatically dropped at the end of the session in which it is created. The temporary table can have the same name as a permanent table. The temporary table is created in a separate, session-specific schema. (You can't specify a name for this schema.) This temporary schema becomes the first schema in the search path, so the temporary table takes precedence over the permanent table unless you qualify the table name with the schema name to access the permanent table. For more information about schemas and precedence, see [search_path](r_search_path.md).  
By default, database users have permission to create temporary tables by their automatic membership in the PUBLIC group. To deny this privilege to a user, revoke the TEMP privilege from the PUBLIC group, and then explicitly grant the TEMP privilege only to specific users or groups of users.
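
As a sketch of that pattern (the database and group names are illustrative):

```
REVOKE TEMP ON DATABASE dev FROM PUBLIC;
GRANT TEMP ON DATABASE dev TO GROUP etl_group;
```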

IF NOT EXISTS  
Clause that indicates that if the specified table already exists, the command should make no changes and return a message that the table exists, rather than stopping with an error. Note that the existing table might be nothing like the one that would have been created; only the table name is compared.  
This clause is useful when scripting, so the script doesn’t fail if CREATE TABLE tries to create a table that already exists.

 *table_name*   
Name of the table to be created.  
If you specify a table name that begins with '#', the table is created as a temporary table. The following is an example:  

```
create table #newtable (id int);
```
You also reference the table with the '#' prefix. For example:   

```
select * from #newtable;
```
The maximum length for the table name is 127 bytes; longer names are truncated to 127 bytes. You can use UTF-8 multibyte characters up to a maximum of four bytes. Amazon Redshift enforces a quota of the number of tables per cluster by node type, including user-defined temporary tables and temporary tables created by Amazon Redshift during query processing or system maintenance. Optionally, the table name can be qualified with the database and schema name. In the following example, the database name is `tickit`, the schema name is `public`, and the table name is `test`.   

```
create table tickit.public.test (c1 int);
```
If the database or schema doesn't exist, the table isn't created, and the statement returns an error. You can't create tables or views in the system databases `template0`, `template1`, `padb_harvest`, or `sys:internal`.  
If a schema name is given, the new table is created in that schema (assuming the creator has access to the schema). The table name must be a unique name for that schema. If no schema is specified, the table is created by using the current database schema. If you are creating a temporary table, you can't specify a schema name, because temporary tables exist in a special schema.  
Multiple temporary tables with the same name can exist at the same time in the same database if they are created in separate sessions because the tables are assigned to different schemas. For more information about valid names, see [Names and identifiers](r_names.md).

 *column_name*   
Name of a column to be created in the new table. The maximum length for the column name is 127 bytes; longer names are truncated to 127 bytes. You can use UTF-8 multibyte characters up to a maximum of four bytes. The maximum number of columns you can define in a single table is 1,600. For more information about valid names, see [Names and identifiers](r_names.md).  
If you are creating a "wide table," take care that your list of columns doesn't exceed row-width boundaries for intermediate results during loads and query processing. For more information, see [Usage notes](#r_CREATE_TABLE_usage).

 *data_type*   
Data type of the column being created. For CHAR and VARCHAR columns, you can use the MAX keyword instead of declaring a maximum length. MAX sets the maximum length to 4,096 bytes for CHAR or 65535 bytes for VARCHAR. The maximum size of a GEOMETRY object is 1,048,447 bytes.  
For information about the data types that Amazon Redshift supports, see [Data types](c_Supported_data_types.md).

DEFAULT *default_expr*   <a name="create-table-default"></a>
Clause that assigns a default data value for the column. The data type of *default_expr* must match the data type of the column. The DEFAULT value must be a variable-free expression. Subqueries, cross-references to other columns in the current table, and user-defined functions aren't allowed.  
The *default_expr* expression is used in any INSERT operation that doesn't specify a value for the column. If no default value is specified, the default value for the column is null.  
If a COPY operation with a defined column list omits a column that has a DEFAULT value, the COPY command inserts the value of *default_expr*.
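
The following sketch (table and column names are illustrative) shows a literal default and a function default being applied when columns are omitted from an INSERT:

```
CREATE TABLE event_log (
  id int,
  status varchar(20) DEFAULT 'new',
  created_at timestamp DEFAULT getdate()
);

INSERT INTO event_log (id) VALUES (1);  -- status and created_at take their defaults
```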

IDENTITY(*seed*, *step*)   <a name="identity-clause"></a>
Clause that specifies that the column is an IDENTITY column. An IDENTITY column contains unique autogenerated values. The data type for an IDENTITY column must be either INT or BIGINT.   
When you add rows using an `INSERT` or `INSERT INTO [tablename] VALUES()` statement, these values start with the value specified as *seed* and increment by the number specified as *step*.   
When you load the table using an `INSERT INTO [tablename] SELECT * FROM` or `COPY` statement, the data is loaded in parallel and distributed to the node slices. To be sure that the identity values are unique, Amazon Redshift skips a number of values when creating the identity values. Identity values are unique, but the order might not match the order in the source files. 
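
For illustration (the table name is hypothetical), the following defines an IDENTITY column that starts at 1 and increments by 1:

```
CREATE TABLE orders (
  order_id bigint IDENTITY(1,1),
  order_date date
);

INSERT INTO orders (order_date) VALUES ('2025-01-01');  -- order_id is autogenerated
```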

GENERATED BY DEFAULT AS IDENTITY(*seed*, *step*)   <a name="identity-generated-bydefault-clause"></a>
Clause that specifies that the column is a default IDENTITY column and enables you to automatically assign a unique value to the column. The data type for an IDENTITY column must be either INT or BIGINT. When you add rows without values, these values start with the value specified as *seed* and increment by the number specified as *step*. For information about how values are generated, see [IDENTITY](#identity-clause) .  
Also, during INSERT, UPDATE, or COPY you can provide a value without EXPLICIT_IDS. Amazon Redshift uses that value to insert into the identity column instead of using the system-generated value. The value can be a duplicate, a value less than the seed, or a value between step values. Amazon Redshift doesn't check the uniqueness of values in the column. Providing a value doesn't affect the next system-generated value.  
If you require uniqueness in the column, don't add a duplicate value. Instead, add a unique value that is less than the seed or between step values.
Keep in mind the following about default identity columns:   
+ Default identity columns are NOT NULL. NULL can't be inserted.
+ To insert a generated value into a default identity column, use the keyword `DEFAULT`. 

  ```
  INSERT INTO tablename (identity-column-name) VALUES (DEFAULT);
  ```
+ Overriding values of a default identity column doesn't affect the next generated value. 
+ You can't add a default identity column with the ALTER TABLE ADD COLUMN statement. 
+ You can append a default identity column with the ALTER TABLE APPEND statement. 
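
A sketch of both behaviors (the table name is hypothetical):

```
CREATE TABLE customers (
  cust_id int GENERATED BY DEFAULT AS IDENTITY(100, 5),
  name varchar(50)
);

INSERT INTO customers (cust_id, name) VALUES (DEFAULT, 'Ana'); -- system-generated value
INSERT INTO customers (cust_id, name) VALUES (42, 'Ben');      -- explicit value; uniqueness isn't checked
```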

ENCODE *encoding*   
The compression encoding for a column. ENCODE AUTO is the default for tables. Amazon Redshift automatically manages compression encoding for all columns in the table. If you specify compression encoding for any column in the table, the table is no longer set to ENCODE AUTO. Amazon Redshift no longer automatically manages compression encoding for all columns in the table. You can specify the ENCODE AUTO option for the table to enable Amazon Redshift to automatically manage compression encoding for all columns in the table.  
  
Amazon Redshift automatically assigns an initial compression encoding to columns for which you don't specify compression encoding as follows:  
+ All columns in temporary tables are assigned RAW compression by default.
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, DOUBLE PRECISION, GEOMETRY, or GEOGRAPHY data type are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR, VARCHAR, or VARBYTE are assigned LZO compression.
If you don't want a column to be compressed, explicitly specify RAW encoding.
 The following [compression encodings](c_Compression_encodings.md#compression-encoding-list) are supported:  
+ AZ64
+ BYTEDICT
+ DELTA
+ DELTA32K
+ LZO
+ MOSTLY8
+ MOSTLY16
+ MOSTLY32
+ RAW (no compression)
+ RUNLENGTH
+ TEXT255
+ TEXT32K
+ ZSTD
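
For illustration (the table name is hypothetical), the following specifies encodings explicitly; note that doing so turns off ENCODE AUTO for the table:

```
CREATE TABLE sales_staging (
  sale_id bigint ENCODE az64,
  notes varchar(256) ENCODE lzo,
  is_final boolean ENCODE raw  -- RAW means no compression
);
```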

DISTKEY  
Keyword that specifies that the column is the distribution key for the table. Only one column in a table can be the distribution key. You can use the DISTKEY keyword after a column name or as part of the table definition by using the DISTKEY (*column_name*) syntax. Either method has the same effect. For more information, see the DISTSTYLE parameter later in this topic.  
The data type of a distribution key column can be: BOOLEAN, REAL, DOUBLE PRECISION, SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, CHAR, or VARCHAR.

SORTKEY  
Keyword that specifies that the column is the sort key for the table. When data is loaded into the table, the data is sorted by one or more columns that are designated as sort keys. You can use the SORTKEY keyword after a column name to specify a single-column sort key, or you can specify one or more columns as sort key columns for the table by using the SORTKEY (*column_name* [, ...]) syntax. Only compound sort keys are created with this syntax.  
You can define a maximum of 400 SORTKEY columns per table.  
The data type of a sort key column can be: BOOLEAN, REAL, DOUBLE PRECISION, SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, CHAR, or VARCHAR.
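
A sketch of both forms (table and column names are hypothetical):

```
-- Single-column sort key, declared on the column
CREATE TABLE t1 (metric_date date SORTKEY, value decimal(10,2));

-- Compound multi-column sort key, declared at the table level
CREATE TABLE t2 (metric_date date, region varchar(20), value decimal(10,2))
COMPOUND SORTKEY (metric_date, region);
```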

COLLATE { CASE_SENSITIVE | CS | CASE_INSENSITIVE | CI }  
A clause that specifies whether string search or comparison on the column is case sensitive or case insensitive. The default value is the same as the current case sensitivity configuration of the database.  
COLLATE is supported only on string-based data types, including CHAR, VARCHAR, and string values within SUPER columns. For details on case-insensitive querying of SUPER data, see [Case-insensitive querying](query-super.md#case-insensitive-super-queries).  
To find the database collation information, use the following command:  

```
SELECT db_collation();
                     
db_collation
----------------
 case_sensitive
(1 row)
```
CASE_SENSITIVE and CS are interchangeable and yield the same results. Similarly, CASE_INSENSITIVE and CI are interchangeable and yield the same results.
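
For illustration (the table name is hypothetical), the following declares a case-insensitive string column:

```
CREATE TABLE app_users (
  username varchar(50) COLLATE CASE_INSENSITIVE
);
```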

NOT NULL | NULL   
NOT NULL specifies that the column isn't allowed to contain null values. NULL, the default, specifies that the column accepts null values. IDENTITY columns are declared NOT NULL by default.

UNIQUE  
Keyword that specifies that the column can contain only unique values. The behavior of the unique table constraint is the same as that for column constraints, with the additional capability to span multiple columns. To define a unique table constraint, use the UNIQUE ( *column_name* [, ... ] ) syntax.  
Unique constraints are informational and aren't enforced by the system.

PRIMARY KEY  
Keyword that specifies that the column is the primary key for the table. Only one column can be defined as the primary key by using a column definition. To define a table constraint with a multiple-column primary key, use the PRIMARY KEY ( *column_name* [, ... ] ) syntax.  
Identifying a column as the primary key provides metadata about the design of the schema. A primary key implies that other tables can rely on this set of columns as a unique identifier for rows. One primary key can be specified for a table, whether as a column constraint or a table constraint. The primary key constraint should name a set of columns that is different from other sets of columns named by any unique constraint defined for the same table.  
PRIMARY KEY columns are also defined as NOT NULL.  
Primary key constraints are informational only. They aren't enforced by the system, but they are used by the planner.

REFERENCES *reftable* [ ( *refcolumn* ) ]  
Clause that specifies a foreign key constraint, which implies that the column must contain only values that match values in the referenced column of some row of the referenced table. The referenced columns should be the columns of a unique or primary key constraint in the referenced table.   
 Foreign key constraints are informational only. They aren't enforced by the system, but they are used by the planner. 

LIKE *parent\_table* [ { INCLUDING \| EXCLUDING } DEFAULTS ]   <a name="create-table-like"></a>
A clause that specifies an existing table from which the new table automatically copies column names, data types, and NOT NULL constraints. The new table and the parent table are decoupled, and any changes made to the parent table aren't applied to the new table. Default expressions for the copied column definitions are copied only if INCLUDING DEFAULTS is specified. The default behavior is to exclude default expressions, so that all columns of the new table have null defaults.   
Tables created with the LIKE option don't inherit primary and foreign key constraints. Distribution style, sort keys, BACKUP, and NULL properties are inherited by LIKE tables, but you can't explicitly set them in the CREATE TABLE ... LIKE statement.
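As an illustration (the table names are hypothetical), the following statement copies column names, data types, NOT NULL constraints, and default expressions from a `parent_sales` table:

```
create table sales_copy (like parent_sales including defaults);
```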

BACKUP { YES \| NO }   <a name="create-table-backup"></a>
A clause that specifies whether the table should be included in automated and manual cluster snapshots.   
For tables, such as staging tables, that don't contain critical data, specify BACKUP NO to save processing time when creating snapshots and restoring from snapshots and to reduce storage space on Amazon Simple Storage Service. The BACKUP NO setting has no effect on automatic replication of data to other nodes within the cluster, so tables with BACKUP NO specified are restored in the event of a node failure. The default is BACKUP YES.  
No-backup tables aren't supported for RA3 provisioned clusters and Amazon Redshift Serverless workgroups. A table marked as no-backup in an RA3 cluster or serverless workgroup is treated as a permanent table that will always be backed up while taking a snapshot, and always restored when restoring from a snapshot. To avoid snapshot costs for no-backup tables, truncate them before taking a snapshot.
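For example, a staging table that is fully reloaded on every run could be excluded from snapshots like this (the table name is illustrative):

```
create table stage_lineitems(
itemid integer,
qty smallint)
backup no;
```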

DISTSTYLE { AUTO \| EVEN \| KEY \| ALL }  
Keyword that defines the data distribution style for the whole table. Amazon Redshift distributes the rows of a table to the compute nodes according to the distribution style specified for the table. The default is AUTO.  
The distribution style that you select for tables affects the overall performance of your database. For more information, see [Data distribution for query optimization](t_Distributing_data.md). Possible distribution styles are as follows:  
+ AUTO: Amazon Redshift assigns an optimal distribution style based on the table data. For example, if AUTO distribution style is specified, Amazon Redshift initially assigns the ALL distribution style to a small table. When the table grows larger, Amazon Redshift might change the distribution style to KEY, choosing the primary key (or a column of the composite primary key) as the DISTKEY. If the table grows larger and none of the columns are suitable to be the DISTKEY, Amazon Redshift changes the distribution style to EVEN. The change in distribution style occurs in the background with minimal impact to user queries. 

  To view the distribution style applied to a table, query the PG\_CLASS system catalog table. For more information, see [Viewing distribution styles](viewing-distribution-styles.md). 
+ EVEN: The data in the table is spread evenly across the nodes in a cluster in a round-robin distribution. Row IDs are used to determine the distribution, and roughly the same number of rows are distributed to each node. 
+ KEY: The data is distributed by the values in the DISTKEY column. When you set the joining columns of joining tables as distribution keys, the joining rows from both tables are collocated on the compute nodes. When data is collocated, the optimizer can perform joins more efficiently. If you specify DISTSTYLE KEY, you must name a DISTKEY column, either for the table or as part of the column definition. For more information, see the DISTKEY parameter earlier in this topic.
+  ALL: A copy of the entire table is distributed to every node. This distribution style ensures that all the rows required for any join are available on every node, but it multiplies storage requirements and increases the load and maintenance times for the table. ALL distribution can improve execution time when used with certain dimension tables where KEY distribution isn't appropriate, but performance improvements must be weighed against maintenance costs. 

DISTKEY ( *column\_name* )  
Constraint that specifies the column to be used as the distribution key for the table. You can use the DISTKEY keyword after a column name or as part of the table definition, by using the DISTKEY (*column\_name*) syntax. Either method has the same effect. For more information, see the DISTSTYLE parameter earlier in this topic.

[ COMPOUND \| INTERLEAVED ] SORTKEY ( *column\_name* [,...] ) \| [ SORTKEY AUTO ]  
Specifies one or more sort keys for the table. When data is loaded into the table, the data is sorted by the columns that are designated as sort keys. You can use the SORTKEY keyword after a column name to specify a single-column sort key, or you can specify one or more columns as sort key columns for the table by using the `SORTKEY (column_name [ , ... ] )` syntax.   
You can optionally specify COMPOUND or INTERLEAVED sort style. If you specify SORTKEY with columns, the default is COMPOUND. For more information, see [Sort keys](t_Sorting_data.md).  
If you don't specify any sort keys options, the default is AUTO.  
You can define a maximum of 400 COMPOUND SORTKEY columns or 8 INTERLEAVED SORTKEY columns per table.     
AUTO  
Specifies that Amazon Redshift assigns an optimal sort key based on the table data. For example, if AUTO sort key is specified, Amazon Redshift initially assigns no sort key to a table. If Amazon Redshift determines that a sort key will improve the performance of queries, then Amazon Redshift might change the sort key of your table. The actual sorting of the table is done by automatic table sort. For more information, see [Automatic table sort](t_Reclaiming_storage_space202.md#automatic-table-sort).   
Amazon Redshift doesn't modify tables that have existing sort or distribution keys, with one exception: if a table has a distribution key that has never been used in a JOIN, the key might be changed if Amazon Redshift determines there is a better one.   
To view the sort key of a table, query the SVV\_TABLE\_INFO system catalog view. For more information, see [SVV\_TABLE\_INFO](r_SVV_TABLE_INFO.md). To view the Amazon Redshift Advisor recommendations for tables, query the SVV\_ALTER\_TABLE\_RECOMMENDATIONS system catalog view. For more information, see [SVV\_ALTER\_TABLE\_RECOMMENDATIONS](r_SVV_ALTER_TABLE_RECOMMENDATIONS.md). To view the actions taken by Amazon Redshift, query the SVL\_AUTO\_WORKER\_ACTION system catalog view. For more information, see [SVL\_AUTO\_WORKER\_ACTION](r_SVL_AUTO_WORKER_ACTION.md).   
COMPOUND  
Specifies that the data is sorted using a compound key made up of all of the listed columns, in the order they are listed. A compound sort key is most useful when a query scans rows according to the order of the sort columns. The performance benefits of sorting with a compound key decrease when queries rely on secondary sort columns. You can define a maximum of 400 COMPOUND SORTKEY columns per table.   
INTERLEAVED  
Specifies that the data is sorted using an interleaved sort key. A maximum of eight columns can be specified for an interleaved sort key.   
An interleaved sort gives equal weight to each column, or subset of columns, in the sort key, so queries don't depend on the order of the columns in the sort key. When a query uses one or more secondary sort columns, interleaved sorting significantly improves query performance. Interleaved sorting carries a small overhead cost for data loading and vacuuming operations.   
Don’t use an interleaved sort key on columns with monotonically increasing attributes, such as identity columns, dates, or timestamps.

ENCODE AUTO   
Enables Amazon Redshift to automatically adjust the encoding type for all columns in the table to optimize query performance. ENCODE AUTO preserves the initial encode types that you specify in creating the table. Then, if Amazon Redshift determines that a new encoding type can improve query performance, Amazon Redshift can change the encoding type of the table columns. ENCODE AUTO is the default if you don't specify an encoding type on any column in the table.

UNIQUE ( *column\_name* [,...] )  
Constraint that specifies that a group of one or more columns of a table can contain only unique values. The behavior of the unique table constraint is the same as that for column constraints, with the additional capability to span multiple columns. In the context of unique constraints, null values aren't considered equal. Each unique table constraint must name a set of columns that is different from the set of columns named by any other unique or primary key constraint defined for the table.   
 Unique constraints are informational and aren't enforced by the system. 

PRIMARY KEY ( *column\_name* [,...] )  
Constraint that specifies that a column or a number of columns of a table can contain only unique (nonduplicate) non-null values. Identifying a set of columns as the primary key also provides metadata about the design of the schema. A primary key implies that other tables can rely on this set of columns as a unique identifier for rows. One primary key can be specified for a table, whether as a single column constraint or a table constraint. The primary key constraint should name a set of columns that is different from other sets of columns named by any unique constraint defined for the same table.   
 Primary key constraints are informational only. They aren't enforced by the system, but they are used by the planner. 

FOREIGN KEY ( *column\_name* [, ... ] ) REFERENCES *reftable* [ ( *refcolumn* ) ]   
Constraint that specifies a foreign key constraint, which requires that a group of one or more columns of the new table must only contain values that match values in the referenced column or columns of some row of the referenced table. If *refcolumn* is omitted, the primary key of *reftable* is used. The referenced columns must be the columns of a unique or primary key constraint in the referenced table.  
Foreign key constraints are informational only. They aren't enforced by the system, but they are used by the planner.

## Usage notes
<a name="r_CREATE_TABLE_usage"></a>

Uniqueness, primary key, and foreign key constraints are informational only; *they are not enforced by Amazon Redshift* when you populate a table. For example, if you insert data into a table with dependencies, the insert can succeed even if it violates the constraint. Nonetheless, primary keys and foreign keys are used as planning hints and they should be declared if your ETL process or some other process in your application enforces their integrity. For information about how to drop a table with dependencies, see [DROP TABLE](r_DROP_TABLE.md).
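As an illustration (the table and values are hypothetical), both of the following inserts succeed even though the second row duplicates the declared primary key:

```
create table pk_demo(id integer primary key, val varchar(10));

insert into pk_demo values (1, 'a');
insert into pk_demo values (1, 'b');  -- succeeds; the constraint isn't enforced
```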

### Limits and quotas
<a name="r_CREATE_TABLE_usage-limits"></a>

Consider the following limits when you create a table.
+ There is a limit for the maximum number of tables in a cluster by node type. For more information, see [Limits](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*. 
+ The maximum number of characters for a table name is 127. 
+ The maximum number of columns you can define in a single table is 1,600. 
+ The maximum number of SORTKEY columns you can define in a single table is 400. 

### Summary of column-level settings and table-level settings
<a name="r_CREATE_TABLE_usage-summary_of_settings"></a>

 Several attributes and settings can be set at the column level or at the table level. In some cases, setting an attribute or constraint at the column level or at the table level has the same effect. In other cases, they produce different results. 

 The following list summarizes column-level and table-level settings: 

DISTKEY  
There is no difference in effect whether set at the column level or at the table level.   
If DISTKEY is set, either at the column level or at the table level, DISTSTYLE must be set to KEY or not set at all. DISTSTYLE can be set only at the table level. 

SORTKEY  
If set at the column level, SORTKEY must be a single column. If SORTKEY is set at the table level, one or more columns can make up a compound or interleaved composite sort key. 
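For example, the following statements (with hypothetical table names) declare a single-column sort key at the column level and a compound sort key at the table level:

```
create table s1(col1 int sortkey, col2 int);

create table s2(col1 int, col2 int)
compound sortkey(col1, col2);
```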

COLLATE CASE\_SENSITIVE \| COLLATE CASE\_INSENSITIVE  
Amazon Redshift doesn't support altering the case-sensitivity configuration of a column. When you append a new column to the table, Amazon Redshift uses the default value for case sensitivity. Amazon Redshift doesn't support the COLLATE keyword when appending a new column.  
For information on how to create databases using database collation, see [CREATE DATABASE](r_CREATE_DATABASE.md).  
For information on the COLLATE function, see [COLLATE function](r_COLLATE.md).

UNIQUE  
At the column level, one or more keys can be set to UNIQUE; the UNIQUE constraint applies to each column individually. If UNIQUE is set at the table level, one or more columns can make up a composite UNIQUE constraint. 

PRIMARY KEY  
If set at the column level, PRIMARY KEY must be a single column. If PRIMARY KEY is set at the table level, one or more columns can make up a composite primary key. 

FOREIGN KEY  
There is no difference in effect whether FOREIGN KEY is set at the column level or at the table level. At the column level, the syntax is simply `REFERENCES` *reftable* [ ( *refcolumn* )]. 
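As an illustration (the table names are hypothetical), the following two definitions are equivalent:

```
-- column-level syntax
create table orders_a(
orderid integer,
custid integer references customer_ref(custid));

-- table-level syntax
create table orders_b(
orderid integer,
custid integer,
foreign key(custid) references customer_ref(custid));
```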

### Distribution of incoming data
<a name="r_CREATE_TABLE_usage-distribution-of-incoming-data"></a>

When the hash distribution scheme of the incoming data matches that of the target table, no physical distribution of the data is actually necessary when the data is loaded. For example, if a distribution key is set for the new table and the data is being inserted from another table that is distributed on the same key column, the data is loaded in place, using the same nodes and slices. However, if the source and target tables are both set to EVEN distribution, data is redistributed into the target table.
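For example, if both of the following tables (hypothetical names) are distributed on `listid`, inserting from one into the other requires no redistribution, because matching rows already reside on the same slices:

```
create table sales_src(listid int distkey, qty int);
create table sales_tgt(listid int distkey, qty int);

insert into sales_tgt select * from sales_src;
```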

### Wide tables
<a name="r_CREATE_TABLE_usage-wide-tables"></a>

You might be able to create a very wide table but be unable to perform query processing, such as INSERT or SELECT statements, on the table. The maximum width of a table with fixed-width columns, such as CHAR, is 64KB - 1 (or 65535 bytes). If a table includes VARCHAR columns, the table can have a larger declared width without returning an error because VARCHAR columns don't contribute their full declared width to the calculated query-processing limit. The effective query-processing limit with VARCHAR columns will vary based on a number of factors.

If a table is too wide for inserting or selecting, you receive the following error.

```
ERROR:  8001
DETAIL:  The combined length of columns processed in the SQL statement
exceeded the query-processing limit of 65535 characters (pid:7627)
```

## Examples
<a name="r_CREATE_TABLE_usage-examples"></a>

For examples that show how to use the CREATE TABLE command, see the [Examples](r_CREATE_TABLE_examples.md) topic.

# Examples
<a name="r_CREATE_TABLE_examples"></a>

The following examples demonstrate various column and table attributes in Amazon Redshift CREATE TABLE statements. For more information about CREATE TABLE, including parameter definitions, see [CREATE TABLE](r_CREATE_TABLE_NEW.md).

Many of the examples use tables and data from the *TICKIT* sample data set. For more information, see [Sample database](https://docs.aws.amazon.com/redshift/latest/dg/c_sampledb.html).

 You can prefix the table name with the database name and schema name in a CREATE TABLE command. For instance, `dev_database.public.sales`. The database name must be the database you are connected to. Any attempt to create database objects in another database fails with an invalid-operation error.

## Create a table with a distribution key, a compound sort key, and compression
<a name="r_CREATE_TABLE_examples-create-a-table-with-distribution-key"></a>

The following example creates a SALES table in the TICKIT database with compression defined for several columns. LISTID is declared as the distribution key, and LISTID and SELLERID are declared as a multicolumn compound sort key. Primary key and foreign key constraints are also defined for the table. Prior to creating the table in the example, you might need to add a UNIQUE constraint to each column referenced by a foreign key, if constraints don't exist.

```
create table sales(
salesid integer not null,
listid integer not null,
sellerid integer not null,
buyerid integer not null,
eventid integer not null encode mostly16,
dateid smallint not null,
qtysold smallint not null encode mostly8,
pricepaid decimal(8,2) encode delta32k,
commission decimal(8,2) encode delta32k,
saletime timestamp,
primary key(salesid),
foreign key(listid) references listing(listid),
foreign key(sellerid) references users(userid),
foreign key(buyerid) references users(userid),
foreign key(dateid) references date(dateid))
distkey(listid)
compound sortkey(listid,sellerid);
```

The results follow:

```
schemaname | tablename | column     | type                        | encoding | distkey | sortkey | notnull
-----------+-----------+------------+-----------------------------+----------+---------+---------+--------
public     | sales     | salesid    | integer                     | lzo      | false   |       0 | true
public     | sales     | listid     | integer                     | none     | true    |       1 | true
public     | sales     | sellerid   | integer                     | none     | false   |       2 | true
public     | sales     | buyerid    | integer                     | lzo      | false   |       0 | true
public     | sales     | eventid    | integer                     | mostly16 | false   |       0 | true
public     | sales     | dateid     | smallint                    | lzo      | false   |       0 | true
public     | sales     | qtysold    | smallint                    | mostly8  | false   |       0 | true
public     | sales     | pricepaid  | numeric(8,2)                | delta32k | false   |       0 | false
public     | sales     | commission | numeric(8,2)                | delta32k | false   |       0 | false
public     | sales     | saletime   | timestamp without time zone | lzo      | false   |       0 | false
```

The following example creates table t1 with a case-insensitive column col1.

```
create table T1 (
  col1 Varchar(20) collate case_insensitive
 );
            
insert into T1 values ('bob'), ('john'), ('Tom'), ('JOHN'), ('Bob');
```

Query the table:

```
select * from T1 where col1 = 'John';
   
col1
------
 john
 JOHN
(2 rows)
```

## Create a table using an interleaved sort key
<a name="CREATE_TABLE_NEW-create-a-table-using-interleaved-sortkey"></a>

The following example creates the CUSTOMER table with an interleaved sort key.

```
create table customer_interleaved (
  c_custkey     	integer        not null,
  c_name        	varchar(25)    not null,
  c_address     	varchar(25)    not null,
  c_city        	varchar(10)    not null,
  c_nation      	varchar(15)    not null,
  c_region      	varchar(12)    not null,
  c_phone       	varchar(15)    not null,
  c_mktsegment      varchar(10)    not null)
diststyle all
interleaved sortkey (c_custkey, c_city, c_mktsegment);
```

## Create a table using IF NOT EXISTS
<a name="CREATE_TABLE_NEW-create-a-table-using-if-not-exists"></a>

 The following example either creates the CITIES table, or does nothing and returns a message if it already exists:

```
create table if not exists cities(
cityid integer not null,
city varchar(100) not null,
state char(2) not null);
```

## Create a table with ALL distribution
<a name="CREATE_TABLE_NEW-create-a-table-with-all-distribution"></a>

 The following example creates the VENUE table with ALL distribution. 

```
create table venue(
venueid smallint not null,
venuename varchar(100),
venuecity varchar(30),
venuestate char(2),
venueseats integer,
primary key(venueid))
diststyle all;
```

## Create a table with EVEN distribution
<a name="r_CREATE_TABLE_NEW-create-a-table-with-default-even-distribution"></a>

The following example creates a table called MYEVENT with three columns. 

```
create table myevent(
eventid int,
eventname varchar(200),
eventcity varchar(30))
diststyle even;
```

The table is distributed evenly and isn't sorted. The table has no declared DISTKEY or SORTKEY columns. 

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'myevent';
            
  column   |          type          | encoding | distkey | sortkey
-----------+------------------------+----------+---------+---------
 eventid   | integer                | lzo      | f       |       0
 eventname | character varying(200) | lzo      | f       |       0
 eventcity | character varying(30)  | lzo      | f       |       0
(3 rows)
```

## Create a temporary table that is LIKE another table
<a name="r_CREATE_TABLE_NEW-create-a-temporary-table-that-is-like-another-table"></a>

The following example creates a temporary table called TEMPEVENT, which inherits its columns from the EVENT table. 

```
create temp table tempevent(like event); 
```

This table also inherits the DISTKEY and SORTKEY attributes of its parent table: 

```
select "column", type, encoding, distkey, sortkey
 from pg_table_def where tablename = 'tempevent';

  column   |            type             | encoding | distkey | sortkey
-----------+-----------------------------+----------+---------+---------
 eventid   | integer                     | none     | t       |       1
 venueid   | smallint                    | none     | f       |       0
 catid     | smallint                    | none     | f       |       0
 dateid    | smallint                    | none     | f       |       0
 eventname | character varying(200)      | lzo      | f       |       0
 starttime | timestamp without time zone | bytedict | f       |       0
(6 rows)
```

## Create a table with an IDENTITY column
<a name="r_CREATE_TABLE_NEW-create-a-table-with-an-identity-column"></a>

The following example creates a table named VENUE\_IDENT, which has an IDENTITY column named VENUEID. This column starts with 0 and increments by 1 for each record. VENUEID is also declared as the primary key of the table. 

```
create table venue_ident(venueid bigint identity(0, 1),
venuename varchar(100),
venuecity varchar(30),
venuestate char(2),
venueseats integer,
primary key(venueid));
```

## Create a table with a default IDENTITY column
<a name="r_CREATE_TABLE_NEW-create-a-table-with-default-identity-column"></a>

The following example creates a table named `t1`. This table has an IDENTITY column named `hist_id` and a default IDENTITY column named `base_id`. 

```
CREATE TABLE t1(
  hist_id BIGINT IDENTITY NOT NULL, /* Cannot be overridden */
  base_id BIGINT GENERATED BY DEFAULT AS IDENTITY NOT NULL, /* Can be overridden */
  business_key varchar(10) ,
  some_field varchar(10)
);
```

Inserting a row into the table shows that both `hist_id` and `base_id` values are generated. 

```
INSERT INTO T1 (business_key, some_field) values ('A','MM');
```

```
SELECT * FROM t1;

 hist_id | base_id | business_key | some_field
---------+---------+--------------+------------
       1 |       1 | A            | MM
```

Inserting a second row shows that the default value for `base_id` is generated.

```
INSERT INTO T1 (base_id, business_key, some_field) values (DEFAULT, 'B','MNOP');
```

```
SELECT * FROM t1;

 hist_id | base_id | business_key | some_field
---------+---------+--------------+------------
       1 |       1 | A            | MM
       2 |       2 | B            | MNOP
```

Inserting a third row shows that the value for `base_id` doesn't need to be unique.

```
INSERT INTO T1 (base_id, business_key, some_field) values (2,'B','MNNN');
```

```
SELECT * FROM t1;
            
 hist_id | base_id | business_key | some_field
---------+---------+--------------+------------
       1 |       1 | A            | MM
       2 |       2 | B            | MNOP
       3 |       2 | B            | MNNN
```

## Create a table with DEFAULT column values
<a name="r_CREATE_TABLE_NEW-create-a-table-with-default-column-values"></a>

The following example creates a CATEGORYDEF table that declares default values for each column: 

```
create table categorydef(
catid smallint not null default 0,
catgroup varchar(10) default 'Special',
catname varchar(10) default 'Other',
catdesc varchar(50) default 'Special events',
primary key(catid));
            
insert into categorydef values(default,default,default,default);
```

```
select * from categorydef;
            
 catid | catgroup | catname |    catdesc
-------+----------+---------+----------------
     0 | Special  | Other   | Special events
(1 row)
```

## DISTSTYLE, DISTKEY, and SORTKEY options
<a name="r_CREATE_TABLE_NEW-diststyle-distkey-and-sortkey-options"></a>

The following example shows how the DISTKEY, SORTKEY, and DISTSTYLE options work. In this example, COL1 is the distribution key; therefore, the distribution style must be either set to KEY or not set. By default, the table has no sort key and so isn't sorted: 

```
create table t1(col1 int distkey, col2 int) diststyle key;
```

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 't1';

column |  type   | encoding | distkey | sortkey
-------+---------+----------+---------+---------
col1   | integer | az64     | t       | 0
col2   | integer | az64     | f       | 0
```

In the following example, the same column is defined as the distribution key and the sort key. Again, the distribution style must be either set to KEY or not set. 

```
create table t2(col1 int distkey sortkey, col2 int);
```

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 't2';
            
column |  type   | encoding | distkey | sortkey
-------+---------+----------+---------+---------
col1   | integer | none     | t       | 1
col2   | integer | az64     | f       | 0
```

In the following example, no column is set as the distribution key, COL2 is set as the sort key, and the distribution style is set to ALL: 

```
create table t3(col1 int, col2 int sortkey) diststyle all;
```

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 't3';
            
Column |  Type   | Encoding | DistKey | SortKey
-------+---------+----------+---------+--------
col1   | integer | az64     | f       | 0
col2   | integer | none     | f       | 1
```

In the following example, the distribution style is set to EVEN and no sort key is defined explicitly; therefore the table is distributed evenly but isn't sorted. 

```
create table t4(col1 int, col2 int) diststyle even;
```

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 't4';
            
 column |  type   | encoding | distkey | sortkey
--------+---------+----------+---------+---------
 col1   | integer | az64     | f       | 0
 col2   | integer | az64     | f       | 0
```

## Create a table with the ENCODE AUTO option
<a name="r_CREATE_TABLE_NEW-create-a-table-with-encode-option"></a>

The following example creates the table `t1` with automatic compression encoding. ENCODE AUTO is the default for tables when you don't specify an encoding type for any column.

```
create table t1(c0 int, c1 varchar);
```

The following example creates the table `t2` with automatic compression encoding by specifying ENCODE AUTO.

```
create table t2(c0 int, c1 varchar) encode auto;
```

The following example creates the table `t3` with automatic compression encoding by specifying ENCODE AUTO. Column `c0` is defined with an initial encoding type of DELTA. Amazon Redshift can change the encoding if another encoding provides better query performance.

```
create table t3(c0 int encode delta, c1 varchar) encode auto;
```

The following example creates the table `t4` with automatic compression encoding by specifying ENCODE AUTO. Column `c0` is defined with an initial encoding of DELTA, and column `c1` is defined with an initial encoding of LZO. Amazon Redshift can change these encodings if other encodings provide better query performance.

```
create table t4(c0 int encode delta, c1 varchar encode lzo) encode auto;
```

# CREATE TABLE AS
<a name="r_CREATE_TABLE_AS"></a>

**Topics**
+ [Syntax](#r_CREATE_TABLE_AS-synopsis)
+ [Parameters](#r_CREATE_TABLE_AS-parameters)
+ [CTAS usage notes](r_CTAS_usage_notes.md)
+ [CTAS examples](r_CTAS_examples.md)

Creates a new table based on a query. The owner of this table is the user that issues the command.

The new table is loaded with data defined by the query in the command. The table columns have names and data types associated with the output columns of the query. The CREATE TABLE AS (CTAS) command creates a new table and evaluates the query to load the new table.

## Syntax
<a name="r_CREATE_TABLE_AS-synopsis"></a>

```
CREATE [ [ LOCAL ] { TEMPORARY | TEMP } ]
TABLE table_name
[ ( column_name [, ... ] ) ]
[ BACKUP { YES | NO } ]
[ table_attributes ]
AS query

where table_attributes are:
[ DISTSTYLE { AUTO | EVEN | ALL | KEY } ]
[ DISTKEY( distkey_identifier ) ]
[ [ COMPOUND | INTERLEAVED ] SORTKEY( column_name [, ...] ) ]
```

## Parameters
<a name="r_CREATE_TABLE_AS-parameters"></a>

LOCAL   
Although this optional keyword is accepted in the statement, it has no effect in Amazon Redshift.

TEMPORARY \| TEMP   
Creates a temporary table. A temporary table is automatically dropped at the end of the session in which it was created.

 *table\_name*   
The name of the table to be created.  
If you specify a table name that begins with '#', the table is created as a temporary table. For example:  

```
create table #newtable (id) as select * from oldtable;
```
The maximum table name length is 127 bytes; longer names are truncated to 127 bytes. Amazon Redshift enforces a quota of the number of tables per cluster by node type. The table name can be qualified with the database and schema name, as the following table shows.  

```
create table tickit.public.test (c1) as select * from oldtable;
```
In this example, `tickit` is the database name and `public` is the schema name. If the database or schema doesn't exist, the statement returns an error.  
If a schema name is given, the new table is created in that schema (assuming the creator has access to the schema). The table name must be a unique name for that schema. If no schema is specified, the table is created using the current database schema. If you are creating a temporary table, you can't specify a schema name, since temporary tables exist in a special schema.  
Multiple temporary tables with the same name are allowed to exist at the same time in the same database if they are created in separate sessions. These tables are assigned to different schemas.

 *column\_name*   
The name of a column in the new table. If no column names are provided, the column names are taken from the output column names of the query. Default column names are used for expressions. For more information about valid names, see [Names and identifiers](r_names.md).

BACKUP { YES | NO }   
A clause that specifies whether the table should be included in automated and manual cluster snapshots.   
For tables, such as staging tables, that don't contain critical data, specify BACKUP NO to save processing time when creating snapshots and restoring from snapshots and to reduce storage space on Amazon Simple Storage Service. The BACKUP NO setting has no effect on automatic replication of data to other nodes within the cluster, so tables with BACKUP NO specified are restored in the event of a node failure. The default is BACKUP YES.  
No-backup tables aren't supported for RA3 provisioned clusters and Amazon Redshift Serverless workgroups. A table marked as no-backup in an RA3 cluster or serverless workgroup is treated as a permanent table that will always be backed up while taking a snapshot, and always restored when restoring from a snapshot. To avoid snapshot costs for no-backup tables, truncate them before taking a snapshot.

DISTSTYLE { AUTO | EVEN | KEY | ALL }  
Keyword that defines the data distribution style for the whole table. Amazon Redshift distributes the rows of a table to the compute nodes according to the distribution style specified for the table. The default is DISTSTYLE AUTO.  
The distribution style that you select for tables affects the overall performance of your database. For more information, see [Data distribution for query optimization](t_Distributing_data.md).  
+ AUTO: Amazon Redshift assigns an optimal distribution style based on the table data. To view the distribution style applied to a table, query the PG_CLASS system catalog table. For more information, see [Viewing distribution styles](viewing-distribution-styles.md). 
+ EVEN: The data in the table is spread evenly across the nodes in a cluster in a round-robin distribution. Row IDs are used to determine the distribution, and roughly the same number of rows are distributed to each node.
+ KEY: The data is distributed by the values in the DISTKEY column. When you set the joining columns of joining tables as distribution keys, the joining rows from both tables are collocated on the compute nodes. When data is collocated, the optimizer can perform joins more efficiently. If you specify DISTSTYLE KEY, you must name a DISTKEY column.
+  ALL: A copy of the entire table is distributed to every node. This distribution style ensures that all the rows required for any join are available on every node, but it multiplies storage requirements and increases the load and maintenance times for the table. ALL distribution can improve execution time when used with certain dimension tables where KEY distribution isn't appropriate, but performance improvements must be weighed against maintenance costs. 

DISTKEY (*column*)  
Specifies a column name or positional number for the distribution key. Use the name specified in either the optional column list for the table or the select list of the query. Alternatively, use a positional number, where the first column selected is 1, the second is 2, and so on. Only one column in a table can be the distribution key:  
+ If you declare a column as the DISTKEY column, DISTSTYLE must be set to KEY or not set at all.
+ If you don't declare a DISTKEY column, you can set DISTSTYLE to EVEN.
+ If you don't specify DISTKEY or DISTSTYLE, CTAS determines the distribution style for the new table based on the query plan for the SELECT clause. For more information, see [Inheritance of column and table attributes](r_CTAS_usage_notes.md#r_CTAS_usage_notes-inheritance-of-column-and-table-attributes).
You can define the same column as the distribution key and the sort key; this approach tends to accelerate joins when the column in question is a joining column in the query.

[ COMPOUND | INTERLEAVED ] SORTKEY ( *column\_name* [, ... ] )  
Specifies one or more sort keys for the table. When data is loaded into the table, the data is sorted by the columns that are designated as sort keys.   
You can optionally specify COMPOUND or INTERLEAVED sort style. The default is COMPOUND. For more information, see [Sort keys](t_Sorting_data.md).  
You can define a maximum of 400 COMPOUND SORTKEY columns or 8 INTERLEAVED SORTKEY columns per table.   
If you don't specify SORTKEY, CTAS determines the sort keys for the new table based on the query plan for the SELECT clause. For more information, see [Inheritance of column and table attributes](r_CTAS_usage_notes.md#r_CTAS_usage_notes-inheritance-of-column-and-table-attributes).    
COMPOUND  
Specifies that the data is sorted using a compound key made up of all of the listed columns, in the order they are listed. A compound sort key is most useful when a query scans rows according to the order of the sort columns. The performance benefits of sorting with a compound key decrease when queries rely on secondary sort columns. You can define a maximum of 400 COMPOUND SORTKEY columns per table.   
INTERLEAVED  
Specifies that the data is sorted using an interleaved sort key. A maximum of eight columns can be specified for an interleaved sort key.   
An interleaved sort gives equal weight to each column, or subset of columns, in the sort key, so queries don't depend on the order of the columns in the sort key. When a query uses one or more secondary sort columns, interleaved sorting significantly improves query performance. Interleaved sorting carries a small overhead cost for data loading and vacuuming operations. 

AS *query*   
Any query (SELECT statement) that Amazon Redshift supports.

# CTAS usage notes
<a name="r_CTAS_usage_notes"></a>

## Limits
<a name="r_CTAS_usage_notes-limits"></a>

Amazon Redshift enforces a quota of the number of tables per cluster by node type. 

The maximum number of characters for a table name is 127. 

The maximum number of columns you can define in a single table is 1,600. 

## Inheritance of column and table attributes
<a name="r_CTAS_usage_notes-inheritance-of-column-and-table-attributes"></a>

CREATE TABLE AS (CTAS) tables don't inherit constraints, identity columns, default column values, or the primary key from the table that they were created from. 

You can't specify column compression encodings for CTAS tables. Amazon Redshift automatically assigns compression encoding as follows:
+ Columns that are defined as sort keys are assigned RAW compression.
+ Columns that are defined as BOOLEAN, REAL, DOUBLE PRECISION, GEOMETRY, or GEOGRAPHY data type are assigned RAW compression.
+ Columns that are defined as SMALLINT, INTEGER, BIGINT, DECIMAL, DATE, TIME, TIMETZ, TIMESTAMP, or TIMESTAMPTZ are assigned AZ64 compression.
+ Columns that are defined as CHAR, VARCHAR, or VARBYTE are assigned LZO compression.

For more information, see [Compression encodings](c_Compression_encodings.md) and [Data types](c_Supported_data_types.md). 

To explicitly assign column encodings, use [CREATE TABLE](r_CREATE_TABLE_NEW.md).

CTAS determines distribution style and sort key for the new table based on the query plan for the SELECT clause. 

For complex queries, such as queries that include joins, aggregations, an order by clause, or a limit clause, CTAS makes a best effort to choose the optimal distribution style and sort key based on the query plan. 

**Note**  
For best performance with large datasets or complex queries, we recommend testing using typical datasets.

You can often predict which distribution key and sort key CTAS chooses by examining the query plan to see which columns, if any, the query optimizer chooses for sorting and distributing data. If the top node of the query plan is a simple sequential scan from a single table (XN Seq Scan), then CTAS generally uses the source table's distribution style and sort key. If the top node of the query plan is anything other than a sequential scan (such as XN Limit, XN Sort, XN HashAggregate, and so on), CTAS makes a best effort to choose the optimal distribution style and sort key based on the query plan.

For example, suppose you create five tables using the following types of SELECT clauses:
+ A simple select statement 
+ A limit clause 
+ An order by clause using LISTID 
+ An order by clause using QTYSOLD 
+ A SUM aggregate function with a group by clause

The following examples show the query plan for each CTAS statement.

```
explain create table sales1_simple as select listid, dateid, qtysold from sales;
                           QUERY PLAN
----------------------------------------------------------------
 XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=8)
(1 row)


explain create table sales2_limit as select listid, dateid, qtysold from sales limit 100;
                              QUERY PLAN
----------------------------------------------------------------------
 XN Limit  (cost=0.00..1.00 rows=100 width=8)
   ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=8)
(2 rows)


explain create table sales3_orderbylistid as select listid, dateid, qtysold from sales order by listid;
                               QUERY PLAN
------------------------------------------------------------------------
 XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=8)
   Sort Key: listid
   ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=8)
(3 rows)


explain create table sales4_orderbyqty as select listid, dateid, qtysold from sales order by qtysold;
                               QUERY PLAN
------------------------------------------------------------------------
 XN Sort  (cost=1000000016724.67..1000000017155.81 rows=172456 width=8)
   Sort Key: qtysold
   ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=8)
(3 rows)


explain create table sales5_groupby as select listid, dateid, sum(qtysold) from sales group by listid, dateid;
                              QUERY PLAN
----------------------------------------------------------------------
 XN HashAggregate  (cost=3017.98..3226.75 rows=83509 width=8)
   ->  XN Seq Scan on sales  (cost=0.00..1724.56 rows=172456 width=8)
(2 rows)
```

To view the distribution key and sort key for each table, query the PG_TABLE_DEF system catalog table, as shown following. 

```
select * from pg_table_def where tablename like 'sales%';

      tablename       |   column   | distkey | sortkey
----------------------+------------+---------+---------
 sales                | salesid    | f       |       0
 sales                | listid     | t       |       0
 sales                | sellerid   | f       |       0
 sales                | buyerid    | f       |       0
 sales                | eventid    | f       |       0
 sales                | dateid     | f       |       1
 sales                | qtysold    | f       |       0
 sales                | pricepaid  | f       |       0
 sales                | commission | f       |       0
 sales                | saletime   | f       |       0
 sales1_simple        | listid     | t       |       0
 sales1_simple        | dateid     | f       |       1
 sales1_simple        | qtysold    | f       |       0
 sales2_limit         | listid     | f       |       0
 sales2_limit         | dateid     | f       |       0
 sales2_limit         | qtysold    | f       |       0
 sales3_orderbylistid | listid     | t       |       1
 sales3_orderbylistid | dateid     | f       |       0
 sales3_orderbylistid | qtysold    | f       |       0
 sales4_orderbyqty    | listid     | t       |       0
 sales4_orderbyqty    | dateid     | f       |       0
 sales4_orderbyqty    | qtysold    | f       |       1
 sales5_groupby       | listid     | f       |       0
 sales5_groupby       | dateid     | f       |       0
 sales5_groupby       | sum        | f       |       0
```

The following table summarizes the results. For simplicity, we omit cost, rows, and width details from the explain plan.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_CTAS_usage_notes.html)

You can explicitly specify distribution style and sort key in the CTAS statement. For example, the following statement creates a table using EVEN distribution and specifies EVENTID as the sort key.

```
create table sales_disteven
diststyle even
sortkey (eventid)
as
select eventid, venueid, dateid, eventname
from event;
```

## Compression encoding
<a name="r_CTAS_usage_notes_encoding"></a>

ENCODE AUTO is used as the default for tables. Amazon Redshift automatically manages compression encoding for all columns in the table.

## Distribution of incoming data
<a name="r_CTAS_usage_notes-distribution-of-incoming-data"></a>

When the hash distribution scheme of the incoming data matches that of the target table, no physical distribution of the data is actually necessary when the data is loaded. For example, if a distribution key is set for the new table and the data is being inserted from another table that is distributed on the same key column, the data is loaded in place, using the same nodes and slices. However, if the source and target tables are both set to EVEN distribution, data is redistributed into the target table. 

## Automatic ANALYZE operations
<a name="r_CTAS_usage_notes-automatic-analyze-operations"></a>

Amazon Redshift automatically analyzes tables that you create with CTAS commands. You do not need to run the ANALYZE command on these tables when they are first created. If you modify them, you should analyze them in the same way as other tables. 

# CTAS examples
<a name="r_CTAS_examples"></a>

The following example creates a table called EVENT_BACKUP for the EVENT table:

```
create table event_backup as select * from event;
```

The resulting table inherits the distribution and sort keys from the EVENT table. 

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'event_backup';

column    | type                        | encoding | distkey | sortkey
----------+-----------------------------+----------+---------+--------
catid     | smallint                    | none     | false   |       0
dateid    | smallint                    | none     | false   |       1
eventid   | integer                     | none     | true    |       0
eventname | character varying(200)      | none     | false   |       0
starttime | timestamp without time zone | none     | false   |       0
venueid   | smallint                    | none     | false   |       0
```

The following command creates a new table called EVENTDISTSORT by selecting four columns from the EVENT table. The new table is distributed by EVENTID and sorted by EVENTID and DATEID: 

```
create table eventdistsort
distkey (1)
sortkey (1,3)
as
select eventid, venueid, dateid, eventname
from event;
```

The result is as follows:

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'eventdistsort';

column   |          type          | encoding | distkey | sortkey
---------+------------------------+----------+---------+-------
eventid   | integer               | none     | t       | 1
venueid   | smallint              | none     | f       | 0
dateid    | smallint              | none     | f       | 2
eventname | character varying(200)| none     | f       | 0
```

You could create exactly the same table by using column names for the distribution and sort keys. For example:

```
create table eventdistsort1
distkey (eventid)
sortkey (eventid, dateid)
as
select eventid, venueid, dateid, eventname
from event;
```

The following statement applies even distribution to the table but doesn't define an explicit sort key. 

```
create table eventdisteven
diststyle even
as
select eventid, venueid, dateid, eventname
from event;
```

The table doesn't inherit the sort key from the EVENT table (EVENTID) because EVEN distribution is specified for the new table. The new table has no sort key and no distribution key. 

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'eventdisteven';

column    |          type          | encoding | distkey | sortkey
----------+------------------------+----------+---------+---------
eventid   | integer                | none     | f       | 0
venueid   | smallint               | none     | f       | 0
dateid    | smallint               | none     | f       | 0
eventname | character varying(200) | none     | f       | 0
```

The following statement applies even distribution and defines a sort key: 

```
create table eventdistevensort diststyle even sortkey (venueid)
as select eventid, venueid, dateid, eventname from event;
```

 The resulting table has a sort key but no distribution key. 

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'eventdistevensort';

column    |          type          | encoding | distkey | sortkey
----------+------------------------+----------+---------+-------
eventid   | integer                | none     | f       | 0
venueid   | smallint               | none     | f       | 1
dateid    | smallint               | none     | f       | 0
eventname | character varying(200) | none     | f       | 0
```

The following statement redistributes the EVENT table on a different key column from the incoming data, which is sorted on the EVENTID column, and defines no SORTKEY column; therefore the table isn't sorted. 

```
create table venuedistevent distkey(venueid)
as select * from event;
```

The result is as follows: 

```
select "column", type, encoding, distkey, sortkey
from pg_table_def where tablename = 'venuedistevent';

 column   |            type             | encoding | distkey | sortkey
----------+-----------------------------+----------+---------+-------
eventid   | integer                     | none     | f       | 0
venueid   | smallint                    | none     | t       | 0
catid     | smallint                    | none     | f       | 0
dateid    | smallint                    | none     | f       | 0
eventname | character varying(200)      | none     | f       | 0
starttime | timestamp without time zone | none     | f       | 0
```

# CREATE TEMPLATE
<a name="r_CREATE_TEMPLATE"></a>

Creates reusable templates for Amazon Redshift commands like [COPY](r_COPY.md). Templates store commonly used parameters that can be referenced across multiple command executions, improving consistency and reducing manual parameter specification.

Templates eliminate the need to repeatedly specify the same formatting parameters across multiple operations, while source paths, target tables, and authorization may vary between operations.

## Required privileges
<a name="r_CREATE_TEMPLATE-privileges"></a>

To create a template, you must have one of the following:
+ Superuser privileges
+ CREATE permission on the schema where you want to create the template, or CREATE scoped permission on schemas in the database where you want to create the template

## Syntax
<a name="r_CREATE_TEMPLATE-synopsis"></a>

```
CREATE [ OR REPLACE ] TEMPLATE [database_name.][schema_name.]template_name
FOR COPY [ AS ]
[ [ FORMAT ] [ AS ] data_format ]
[ parameter [ argument ] [ , ... ] ];
```

## Parameters
<a name="r_CREATE_TEMPLATE-parameters"></a>

OR REPLACE   
If a template of the same name already exists in the specified database and schema, the existing template is replaced. You can only replace a template with a new template that defines the same operation type, for example, COPY. You must have the necessary privileges to replace a template.

*database\_name*  
(Optional) The name of the database where the template will be created. If not specified, the template is created in the current database.  
If the database or schema doesn't exist, the template isn't created, and the statement returns an error. You can't create templates in the system databases `template0`, `template1`, `padb_harvest`, or `sys:internal`.

*schema\_name*  
(Optional) The name of the schema where the template will be created. If not specified, the template is created in the current schema.  
If a schema name is given, the new template is created in that schema (assuming the creator has access to the schema). The template name must be a unique name for that schema.

*template\_name*  
The name of the template to be created. Optionally, the template name can be qualified with the database and schema name. In the following example, the database name is `demo_database`, the schema name is `demo_schema`, and the template name is `test`. For more information about valid names, see [Names and identifiers](r_names.md).  

```
CREATE TEMPLATE demo_database.demo_schema.test FOR COPY AS CSV;
```

COPY  
Specifies the Redshift command type for which the template is created. Currently, only the COPY command is supported.

[ [ FORMAT ] [ AS ] *data\_format* ]   
(Optional) Specifies the data format for COPY operations.

[ *parameter* [ argument ]]  
Any valid parameter for the specified Redshift command.  
For example, templates for the COPY command can include:  
+ [Data format parameters](copy-parameters-data-format.md)
+ [File compression parameters](copy-parameters-file-compression.md)
+ [Data conversion parameters](copy-parameters-data-conversion.md)
+ [Data load operations](copy-parameters-data-load.md)
For a complete list of supported parameters, see [COPY](r_COPY.md) command.

### Usage notes
<a name="create_template-usage-notes"></a>
+ By default, all users have CREATE and USAGE privileges on the PUBLIC schema. To disallow users from creating objects in the PUBLIC schema of a database, use the REVOKE command to remove that privilege.
+ When a parameter exists in both the template and the command, the command parameter takes precedence.
+ Templates are database objects and follow standard Redshift object naming and permission rules. For more information about valid names, see [Names and identifiers](r_names.md).
+ Templates cannot contain manifest file specifications for the [COPY](r_COPY.md) command.

### Limitations
<a name="create_template-limitations"></a>
+ At least one parameter must be specified when creating a template.
+ Excluded parameters – Command-specific parameters such as source paths, target tables, authorization credentials, and manifest file specifications cannot be included in templates. These parameters must be specified in the actual command.
+ Maximum templates per cluster – You can create a maximum of 1,000 templates per cluster. This limit applies to the total number of templates across all databases and schemas in the cluster.
+ Cross-database references – Templates cannot be referenced across databases.
+ Data sharing – Templates cannot be included in data shares. Templates must be created separately in each cluster where they are needed.

## Examples
<a name="r_CREATE_TEMPLATE-examples"></a>

The following example creates a template for the COPY command: 

```
CREATE TEMPLATE test_schema.demo_template
FOR COPY
AS
FORMAT JSON 'auto'
NULL AS ''
MAXERROR 100;
```

Use [SHOW TEMPLATE](r_SHOW_TEMPLATE.md) to get the definition of the template:

```
SHOW TEMPLATE test_schema.demo_template;
CREATE OR REPLACE TEMPLATE dev.test_schema.demo_template FOR COPY AS FORMAT AS JSON 'auto' NULL '' MAXERROR 100;
```

 Query the [SYS_REDSHIFT_TEMPLATE](SYS_REDSHIFT_TEMPLATE.md) system view to get more details about a template. 

```
SELECT * FROM SYS_REDSHIFT_TEMPLATE;

database_name | schema_name | template_name | template_type |        create_time         |     last_modified_time     | owner_id | last_modified_by | template_parameters 
---------------+-------------+---------------+---------------+----------------------------+----------------------------+----------+------------------+---------------------
 dev           | test_schema | demo_template |             1 | 2025-12-17 20:06:01.944171 | 2025-12-17 20:06:01.944171 |        1 |                1 | {
    "JSON": "auto",
    "MAXERROR": 100,
    "NULL": ""
}
```

# CREATE USER
<a name="r_CREATE_USER"></a>

Creates a new database user. Database users can retrieve data, run commands, and perform other actions in a database, depending on their privileges and roles. You must be a database superuser to run this command.

## Required privileges
<a name="r_CREATE_USER-privileges"></a>

Following are required privileges for CREATE USER:
+ Superuser
+ Users with the CREATE USER privilege

## Syntax
<a name="r_CREATE_USER-synopsis"></a>

```
CREATE USER name [ WITH ]
PASSWORD { 'password' | 'md5hash' | 'sha256hash' | DISABLE }
[ option [ ... ] ]

where option can be:

CREATEDB | NOCREATEDB
| CREATEUSER | NOCREATEUSER
| SYSLOG ACCESS { RESTRICTED | UNRESTRICTED }
| IN GROUP groupname [, ... ]
| VALID UNTIL 'abstime'
| CONNECTION LIMIT { limit | UNLIMITED }
| SESSION TIMEOUT limit
| EXTERNALID external_id
```

## Parameters
<a name="r_CREATE_USER-parameters"></a>

 *name*   
The name of the user to create. The user name can't be `PUBLIC`. For more information about valid names, see [Names and identifiers](r_names.md).

WITH  
Optional keyword. WITH is ignored by Amazon Redshift.

PASSWORD { '*password*' | '*md5hash*' | '*sha256hash*' | DISABLE }  
Sets the user's password.   
By default, users can change their own passwords, unless the password is disabled. To disable a user's password, specify DISABLE. When a user's password is disabled, the password is deleted from the system and the user can log on only using temporary AWS Identity and Access Management (IAM) user credentials. For more information, see [Using IAM Authentication to Generate Database User Credentials](https://docs.aws.amazon.com/redshift/latest/mgmt/generating-user-credentials.html). Only a superuser can enable or disable passwords. You can't disable a superuser's password. To enable a password, run [ALTER USER](r_ALTER_USER.md) and specify a password.  
You can specify the password in clear text, as an MD5 hash string, or as a SHA256 hash string.   
 When you launch a new cluster using the AWS Management Console, AWS CLI, or Amazon Redshift API, you must supply a clear text password for the initial database user. You can change the password later by using [ALTER USER](r_ALTER_USER.md). 
For clear text, the password must meet the following constraints:  
+ It must be 8 to 64 characters in length.
+ It must contain at least one uppercase letter, one lowercase letter, and one number.
+ It can use any ASCII characters with ASCII codes 33–126, except ' (single quotation mark), " (double quotation mark), \\, /, or @.
As a more secure alternative to passing the CREATE USER password parameter as clear text, you can specify an MD5 hash of a string that includes the password and user name.   
When you specify an MD5 hash string, the CREATE USER command checks for a valid MD5 hash string, but it doesn't validate the password portion of the string. It is possible in this case to create a password, such as an empty string, that you can't use to log on to the database.
To specify an MD5 password, follow these steps:   

1. Concatenate the password and user name. 

   For example, for password `ez` and user `user1`, the concatenated string is `ezuser1`. 

1. Convert the concatenated string into a 32-character MD5 hash string. You can use any MD5 utility to create the hash string. The following example uses the Amazon Redshift [MD5 function](r_MD5.md) and the concatenation operator ( || ) to return a 32-character MD5 hash string. 

   ```
   select md5('ez' || 'user1');
                           
   md5
   --------------------------------
   153c434b4b77c89e6b94f12c5393af5b
   ```

1. Concatenate '`md5`' in front of the MD5 hash string and provide the concatenated string as the *md5hash* argument.

   ```
   create user user1 password 'md5153c434b4b77c89e6b94f12c5393af5b';
   ```

1. Log on to the database using the sign-in credentials. 

   For this example, log on as `user1` with password `ez`. 
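
Because step 2 accepts any MD5 utility, you can also compute the hash outside the database. For example, the following Python sketch (an illustration using the standard `hashlib` module, not part of the Redshift documentation) produces the same 32-character string:

```python
import hashlib

# Concatenate the password ('ez') and user name ('user1'), then take the
# MD5 hex digest of the combined string.
digest = hashlib.md5(b'ez' + b'user1').hexdigest()

# Prefix the digest with 'md5' to form the md5hash argument for CREATE USER.
print('md5' + digest)
```

The printed value matches the `md5153c434b4b77c89e6b94f12c5393af5b` argument used in step 3.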
Another secure alternative is to specify an SHA-256 hash of a password string, or you can provide your own valid SHA-256 digest and 256-bit salt that was used to create the digest.  
+ Digest – The output of a hashing function.
+ Salt – Randomly generated data that is combined with the password to help reduce patterns in the hashing function output.

```
'sha256|Mypassword'
```

```
'sha256|digest|256-bit-salt'
```
In the following example, Amazon Redshift generates and manages the salt.   

```
CREATE USER admin PASSWORD 'sha256|Mypassword1';
```
In the following example, a valid SHA-256 digest and 256-bit salt that was used to create the digest are supplied.  
To specify a password and hash it with your own salt, follow these steps:  

1. Create a 256-bit salt. You can obtain a salt by using any hexadecimal string generator to generate a string 64 characters long. For this example, the salt is `c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6`. 

1. Use the FROM_HEX function to convert your salt to binary. This is because the SHA2 function requires the binary representation of the salt. See the following statement. 

   ```
   SELECT FROM_HEX('c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6');
   ```

1.  Use the CONCAT function to append your salt to your password. For this example, the password is `Mypassword1`. See the following statement. 

   ```
   SELECT CONCAT('Mypassword1',FROM_HEX('c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6'));
   ```

1. Use the SHA2 function to create a digest from your password and salt combination. See the following statement.

   ```
   SELECT SHA2(CONCAT('Mypassword1',FROM_HEX('c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6')), 0);
   ```

1.  Using the digest and salt from the previous steps, create the user. See the following statement. 

   ```
   CREATE USER admin PASSWORD 'sha256|821708135fcc42eb3afda85286dee0ed15c2c461d000291609f77eb113073ec2|c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6';
   ```

1. Log on to the database using the sign-in credentials.

    For this example, log on as `admin` with password `Mypassword1`.
If you set a password in plain text without specifying the hashing function, then an MD5 digest is generated using the username as the salt. 
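
Because steps 1–4 can be performed with any external tool, you can also build the digest outside the database. The following Python sketch (an illustration using the standard `hashlib` and `secrets` modules, assuming Redshift's SHA2 with a second argument of 0 computes standard SHA-256) mirrors those steps:

```python
import hashlib
import secrets

# Step 1: a 256-bit salt expressed as 64 hexadecimal characters. Any hex
# string generator works; secrets.token_hex(32) produces a fresh one.
new_salt = secrets.token_hex(32)

# Here we reuse the salt from the example above so the result is repeatable.
salt_hex = 'c721bff5d9042cf541ff7b9d48fa8a6e545c19a763e3710151f9513038b0f6c6'

# Steps 2-4: convert the salt to binary, append it to the password, and
# hash the combination with SHA-256.
password = 'Mypassword1'
digest = hashlib.sha256(password.encode() + bytes.fromhex(salt_hex)).hexdigest()

# The CREATE USER password argument has the form 'sha256|digest|salt'.
print("'sha256|{}|{}'".format(digest, salt_hex))
```

Pass the printed string as the password in the CREATE USER statement, as in step 5.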

CREATEDB | NOCREATEDB   
The CREATEDB option allows the new user to create databases. The default is NOCREATEDB.

CREATEUSER | NOCREATEUSER   
The CREATEUSER option creates a superuser with all database privileges, including CREATE USER. The default is NOCREATEUSER. For more information, see [Superusers](r_superusers.md).

SYSLOG ACCESS { RESTRICTED | UNRESTRICTED }  <a name="create-user-syslog-access"></a>
A clause that specifies the level of access the user has to the Amazon Redshift system tables and views.   
Regular users who have the SYSLOG ACCESS RESTRICTED permission can see only the rows generated by that user in user-visible system tables and views. The default is RESTRICTED.   
Regular users who have the SYSLOG ACCESS UNRESTRICTED permission can see all rows in user-visible system tables and views, including rows generated by another user. UNRESTRICTED doesn't give a regular user access to superuser-visible tables. Only superusers can see superuser-visible tables.   
Giving a user unrestricted access to system tables gives the user visibility to data generated by other users. For example, STL_QUERY and STL_QUERYTEXT contain the full text of INSERT, UPDATE, and DELETE statements, which might contain sensitive user-generated data. 
All rows in SVV_TRANSACTIONS are visible to all users.   
For more information, see [Visibility of data in system tables and views](cm_chap_system-tables.md#c_visibility-of-data).

IN GROUP *groupname*   
Specifies the name of an existing group that the user belongs to. Multiple group names may be listed.

VALID UNTIL *abstime*   
The VALID UNTIL option sets an absolute time after which the user's password is no longer valid. By default the password has no time limit.

CONNECTION LIMIT { *limit* | UNLIMITED }   
The maximum number of database connections the user is permitted to have open concurrently. The limit isn't enforced for superusers. Use the UNLIMITED keyword to permit the maximum number of concurrent connections. A limit on the number of connections for each database might also apply. For more information, see [CREATE DATABASE](r_CREATE_DATABASE.md). The default is UNLIMITED. To view current connections, query the [STV\_SESSIONS](r_STV_SESSIONS.md) system view.  
If both user and database connection limits apply, an unused connection slot must be available that is within both limits when a user attempts to connect.

SESSION TIMEOUT *limit*  
The maximum time in seconds that a session remains inactive or idle. The range is 60 seconds (one minute) to 1,728,000 seconds (20 days). If no session timeout is set for the user, the cluster setting applies. For more information, see [ Quotas and limits in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html) in the *Amazon Redshift Management Guide*.  
When you set the session timeout, it's applied to new sessions only.  
To view information about active user sessions, including the start time, user name, and session timeout, query the [STV\_SESSIONS](r_STV_SESSIONS.md) system view. To view information about user-session history, query the [STL\_SESSIONS](r_STL_SESSIONS.md) view. To retrieve information about database users, including session-timeout values, query the [SVL\_USER\_INFO](r_SVL_USER_INFO.md) view.

EXTERNALID *external\_id*  
The identifier for the user, which is associated with an identity provider. The user must have their password disabled. For more information, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html).

### Usage notes
<a name="create_user-usage-notes"></a>

By default, all users have CREATE and USAGE privileges on the PUBLIC schema. To disallow users from creating objects in the PUBLIC schema of a database, use the REVOKE command to remove that privilege.
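For example, the following command removes the CREATE privilege on the PUBLIC schema from all users:

```
revoke create on schema public from public;
```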

When using IAM authentication to create database user credentials, you might want to create a superuser that is able to log on only using temporary credentials. You can't disable a superuser's password, but you can create an unknown password using a randomly generated MD5 hash string.

```
create user iam_superuser password 'md5A1234567890123456780123456789012' createuser;
```

The case of a *username* enclosed in double quotation marks is always preserved regardless of the setting of the `enable_case_sensitive_identifier` configuration option. For more information, see [enable\_case\_sensitive\_identifier](r_enable_case_sensitive_identifier.md).
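For example, the following command creates a user whose name retains its mixed case because it is enclosed in double quotation marks (the name and password are samples):

```
create user "ReportUser" password 'abcD1234';
```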

## Examples
<a name="r_CREATE_USER-examples"></a>

The following command creates a user named dbuser, with the password "abcD1234", database creation privileges, and a connection limit of 30.

```
create user dbuser with password 'abcD1234' createdb connection limit 30;
```

Query the PG\_USER\_INFO catalog table to view details about a database user. 

```
select * from pg_user_info;
         
 usename   | usesysid | usecreatedb | usesuper | usecatupd | passwd   | valuntil | useconfig | useconnlimit
-----------+----------+-------------+----------+-----------+----------+----------+-----------+-------------
 rdsdb     |        1 | true        | true     | true      | ******** | infinity |           |
 adminuser |      100 | true        | true     | false     | ******** |          |           | UNLIMITED
 dbuser    |      102 | true        | false    | false     | ******** |          |           | 30
```

In the following example, the account password is valid until June 10, 2017.

```
create user dbuser with password 'abcD1234' valid until '2017-06-10';
```

 The following example creates a user with a case-sensitive password that contains special characters.

```
create user newman with password '@AbC4321!';
```

To use a backslash ('`\`') in your MD5 password, escape the backslash with a backslash in your source string. The following example creates a user named `slashpass` with a single backslash ('`\`') as the password. 

```
select md5('\\'||'slashpass');
         
md5
--------------------------------
0c983d1a624280812631c5389e60d48c
```

Create a user with the md5 password.

```
create user slashpass password 'md50c983d1a624280812631c5389e60d48c';
```

The following example creates a user named `dbuser` with an idle-session timeout set to 120 seconds.

```
CREATE USER dbuser password 'abcD1234' SESSION TIMEOUT 120;
```

The following example creates a user named `bob`. The namespace is `myco_aad`. This is only a sample. To run the command successfully, you must have a registered identity provider. For more information, see [Native identity provider (IdP) federation for Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-native-idp.html).

```
CREATE USER myco_aad:bob EXTERNALID "ABC123" PASSWORD DISABLE;
```

# CREATE VIEW
<a name="r_CREATE_VIEW"></a>

Creates a view in a database. The view isn't physically materialized; the query that defines the view is run every time the view is referenced in a query. To create a view with an external table, include the WITH NO SCHEMA BINDING clause.

To create a standard view, you need access to the underlying tables, or to underlying views. To query a standard view, you need select permissions for the view itself, but you don't need select permissions for the underlying tables. In a case where you create a view that references a table or view in another schema, or if you create a view that references a materialized view, you need usage permissions. To query a late binding view, you need select permissions for the late binding view itself. You should also make sure the owner of the late binding view has select permissions for the referenced objects (tables, views, or user-defined functions). For more information about late-binding views, see [Usage notes](#r_CREATE_VIEW_usage_notes).

## Required permissions
<a name="r_CREATE_VIEW-privileges"></a>

To use CREATE VIEW, one of the following permissions is required.
+ To create a view using CREATE [ OR REPLACE ] VIEW:
  + Superuser
  + Users with the CREATE [ OR REPLACE ] VIEW permission
+ To replace an existing view using CREATE OR REPLACE VIEW:
  + Superuser
  + Users with the CREATE [ OR REPLACE ] VIEW permission
  + View owner

If a user wants to access a view that incorporates a user-defined function, the user must have the EXECUTE permission on that function.
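For example, if a view calls a scalar user-defined function, you can grant the EXECUTE permission on it as follows (the function and user names are samples):

```
grant execute on function f_amount(int) to report_user;
```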

## Syntax
<a name="r_CREATE_VIEW-synopsis"></a>

```
CREATE [ OR REPLACE ] VIEW name [ ( column_name [, ...] ) ] AS query
[ WITH NO SCHEMA BINDING ]
```

## Parameters
<a name="r_CREATE_VIEW-parameters"></a>

OR REPLACE   
If a view of the same name already exists, the view is replaced. You can only replace a view with a new query that generates the identical set of columns, using the same column names and data types. CREATE OR REPLACE VIEW locks the view for reads and writes until the operation completes.  
When a view is replaced, its other properties such as ownership and granted privileges are preserved. 

 *name*   
The name of the view. If a schema name is given (such as `myschema.myview`) the view is created using the specified schema. Otherwise, the view is created in the current schema. The view name must be different from the name of any other view or table in the same schema.   
If you specify a view name that begins with '#', the view is created as a temporary view that is visible only in the current session.  
For more information about valid names, see [Names and identifiers](r_names.md). You can't create tables or views in the system databases template0, template1, padb\_harvest, or sys:internal.

 *column\_name*   
Optional list of names to be used for the columns in the view. If no column names are given, the column names are derived from the query. The maximum number of columns you can define in a single view is 1,600.

 *query*   
A query (in the form of a SELECT statement) that evaluates to a table. This table defines the columns and rows in the view. 

 WITH NO SCHEMA BINDING   
Clause that specifies that the view isn't bound to the underlying database objects, such as tables and user-defined functions. As a result, there is no dependency between the view and the objects it references. You can create a view even if the referenced objects don't exist. Because there is no dependency, you can drop or alter a referenced object without affecting the view. Amazon Redshift doesn't check for dependencies until the view is queried. Recursive common table expressions (rCTE) are not supported with late-binding views. To view details about late-binding views, run the [PG\$1GET\$1LATE\$1BINDING\$1VIEW\$1COLS](PG_GET_LATE_BINDING_VIEW_COLS.md) function.  
When you include the WITH NO SCHEMA BINDING clause, tables and views referenced in the SELECT statement must be qualified with a schema name. The schema must exist when the view is created, even if the referenced table doesn't exist. For example, the following statement returns an error.   

```
create view myevent as select eventname from event
with no schema binding;
```
The following statement runs successfully.  

```
create view myevent as select eventname from public.event
with no schema binding;
```

**Note**  
You can't update, insert into, or delete from a view. 

## Usage notes
<a name="r_CREATE_VIEW_usage_notes"></a>



### Late-binding views
<a name="r_CREATE_VIEW_late-binding-views"></a>

A late-binding view doesn't check the underlying database objects, such as tables and other views, until the view is queried. As a result, you can alter or drop the underlying objects without dropping and recreating the view. If you drop underlying objects, queries to the late-binding view will fail. If the query to the late-binding view references columns in the underlying object that aren't present, the query will fail. 

 If you drop and then re-create a late-binding view's underlying table or view, the new object is created with default access permissions. You might need to grant permissions to the underlying objects for users who will query the view. 
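For example, after re-creating an underlying table, you can regrant access so that users can query the late-binding view again (the table and user names are samples):

```
grant select on public.event to report_user;
```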

To create a late-binding view, include the WITH NO SCHEMA BINDING clause. The following example creates a view with no schema binding. 

```
create view event_vw as select * from public.event
with no schema binding;
```

```
select * from event_vw limit 1;
            
eventid | venueid | catid | dateid | eventname     | starttime
--------+---------+-------+--------+---------------+--------------------
      2 |     306 |     8 |   2114 | Boris Godunov | 2008-10-15 20:00:00
```

The following example shows that you can alter an underlying table without recreating the view. 

```
alter table event rename column eventname to title;
```

```
select * from event_vw limit 1;
            
eventid | venueid | catid | dateid | title         | starttime
--------+---------+-------+--------+---------------+--------------------
      2 |     306 |     8 |   2114 | Boris Godunov | 2008-10-15 20:00:00
```

You can reference Amazon Redshift Spectrum external tables only in a late-binding view. One application of late-binding views is to query both Amazon Redshift and Redshift Spectrum tables. For example, you can use the [UNLOAD](r_UNLOAD.md) command to archive older data to Amazon S3. Then, create a Redshift Spectrum external table that references the data on Amazon S3 and create a view that queries both tables. The following example uses a UNION ALL clause to join the Amazon Redshift `SALES` table and the Redshift Spectrum `SPECTRUM.SALES` table.

```
create view sales_vw as
select * from public.sales
union all
select * from spectrum.sales
with no schema binding;
```

For more information about creating Redshift Spectrum external tables, including the `SPECTRUM.SALES` table, see [Getting started with Amazon Redshift Spectrum](c-getting-started-using-spectrum.md).

**Important**  
When you create a standard view from a late-binding view, the standard view's definition contains the definition of the late-binding view at the time the standard view was created, including the owner of the late-binding view. If you change the underlying late-binding view, those changes aren't reflected in the standard view until you re-create the standard view. When the standard view is queried, it always uses the late-binding view's definition, and the late-binding view's owner for permission checking, as they were at the time the standard view was created.

To update the standard view to refer to the latest definition of the late-binding view, run CREATE OR REPLACE VIEW with the initial view definition you used to create the standard view.

See the following example of creating a standard view from a late-binding view.

```
create view sales_vw_lbv as 
select * from public.sales 
with no schema binding;

show view sales_vw_lbv;
                            Show View DDL statement
--------------------------------------------------------------------------------
 create view sales_vw_lbv as select * from public.sales with no schema binding;
(1 row)

create view sales_vw as 
select * from sales_vw_lbv;

show view sales_vw;
                                               Show View DDL statement
---------------------------------------------------------------------------------------------------------------------
 SELECT sales_vw_lbv.price, sales_vw_lbv."region" FROM (SELECT sales.price, sales."region" FROM sales) sales_vw_lbv;
(1 row)
```

Note that the late-binding view as shown in the DDL statement for the standard view is defined when the standard view is created, and won’t update with any changes you make to the late-binding view afterward.

## Examples
<a name="r_CREATE_VIEW-examples"></a>

The example commands use a sample set of objects and data called the *TICKIT* database. For more information, see [Sample database](https://docs.aws.amazon.com/redshift/latest/dg/c_sampledb.html).

The following command creates a view called *myevent* from a table called EVENT. 

```
create view myevent as select eventname from event
where eventname = 'LeAnn Rimes';
```

The following command creates a view called *myuser* from a table called USERS. 

```
create view myuser as select lastname from users;
```

The following command creates or replaces a view called *myuser* from a table called USERS. 

```
create or replace view myuser as select lastname from users;
```

The following example creates a view with no schema binding. 

```
create view myevent as select eventname from public.event
with no schema binding;
```

# DEALLOCATE
<a name="r_DEALLOCATE"></a>

Deallocates a prepared statement. 

## Syntax
<a name="r_DEALLOCATE-synopsis"></a>

```
DEALLOCATE [PREPARE] plan_name
```

## Parameters
<a name="r_DEALLOCATE-parameters"></a>

PREPARE   
This keyword is optional and is ignored. 

 *plan\_name*   
The name of the prepared statement to deallocate. 

## Usage notes
<a name="r_DEALLOCATE_usage_notes"></a>

DEALLOCATE is used to deallocate a previously prepared SQL statement. If you don't explicitly deallocate a prepared statement, it is deallocated when the current session ends. For more information on prepared statements, see [PREPARE](r_PREPARE.md).
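For example, the following statements prepare a plan, run it, and then free it (the plan name and query are samples based on the TICKIT database):

```
prepare prep_select (int) as select * from category where catid = $1;
execute prep_select (9);
deallocate prep_select;
```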

## See Also
<a name="r_DEALLOCATE-see-also"></a>

 [EXECUTE](r_EXECUTE.md), [PREPARE](r_PREPARE.md) 

# DECLARE
<a name="declare"></a>

Defines a new cursor. Use a cursor to retrieve a few rows at a time from the result set of a larger query. 

When the first row of a cursor is fetched, the entire result set is materialized on the leader node, in memory or on disk, if needed. Because of the potential negative performance impact of using cursors with large result sets, we recommend using alternative approaches whenever possible. For more information, see [Performance considerations when using cursors](#declare-performance).

You must declare a cursor within a transaction block. Only one cursor at a time can be open per session. 

For more information, see [FETCH](fetch.md), [CLOSE](close.md).

## Syntax
<a name="declare-synopsis"></a>

```
DECLARE cursor_name CURSOR FOR query
```

## Parameters
<a name="declare-parameters"></a>

*cursor\_name*   
Name of the new cursor. 

 *query*   
A SELECT statement that populates the cursor.

## DECLARE CURSOR usage notes
<a name="declare-usage"></a>

If your client application uses an ODBC connection and your query creates a result set that is too large to fit in memory, you can stream the result set to your client application by using a cursor. When you use a cursor, the entire result set is materialized on the leader node, and then your client can fetch the results incrementally. 

**Note**  
To enable cursors in ODBC for Microsoft Windows, enable the **Use Declare/Fetch** option in the ODBC DSN you use for Amazon Redshift. We recommend setting the ODBC cache size, using the **Cache Size** field in the ODBC DSN options dialog, to 4,000 or greater on multi-node clusters to minimize round trips. On a single-node cluster, set Cache Size to 1,000.

Because of the potential negative performance impact of using cursors, we recommend using alternative approaches whenever possible. For more information, see [Performance considerations when using cursors](#declare-performance).

Amazon Redshift cursors are supported with the following limitations:
+ Only one cursor at a time can be open per session. 
+ Cursors must be used within a transaction (BEGIN … END). 
+ The maximum cumulative result set size for all cursors is constrained based on the cluster node type. If you need larger result sets, you can resize to an XL or 8XL node configuration.

  For more information, see [Cursor constraints](#declare-constraints). 

## Cursor constraints
<a name="declare-constraints"></a>

When the first row of a cursor is fetched, the entire result set is materialized on the leader node. If the result set doesn't fit in memory, it is written to disk as needed. To protect the integrity of the leader node, Amazon Redshift enforces constraints on the size of all cursor result sets, based on the cluster's node type.

The following table shows the maximum total result set size for each cluster node type. Maximum result set sizes are in megabytes.


| Node type | Maximum result set per cluster (MB) | 
| --- | --- | 
|   DC2 Large multiple nodes   | 192,000 | 
|   DC2 Large single node   | 8,000 | 
|   DC2 8XL multiple nodes   | 3,200,000 | 
|   RA3 16XL multiple nodes   | 14,400,000 | 
|   RA3 4XL multiple nodes   | 3,200,000 | 
|   RA3 XLPLUS multiple nodes   | 1,000,000 | 
|   RA3 XLPLUS single node   | 64,000 | 
|   RA3 LARGE multiple nodes   | 240,000 | 
|   RA3 LARGE single node   | 8,000 | 
| Amazon Redshift Serverless | 150,000 | 

To view the active cursor configuration for a cluster, query the [STV\_CURSOR\_CONFIGURATION](r_STV_CURSOR_CONFIGURATION.md) system table as a superuser. To view the state of active cursors, query the [STV\_ACTIVE\_CURSORS](r_STV_ACTIVE_CURSORS.md) system table. Only the rows for a user's own cursors are visible to the user, but a superuser can view all cursors.

## Performance considerations when using cursors
<a name="declare-performance"></a>

Because cursors materialize the entire result set on the leader node before beginning to return results to the client, using cursors with very large result sets can have a negative impact on performance. We strongly recommend against using cursors with very large result sets. In some cases, such as when your application uses an ODBC connection, cursors might be the only feasible solution. If possible, we recommend using these alternatives:
+ Use [UNLOAD](r_UNLOAD.md) to export a large table. When you use UNLOAD, the compute nodes work in parallel to transfer the data directly to data files on Amazon Simple Storage Service. For more information, see [Unloading data in Amazon Redshift](c_unloading_data.md). 
+ Set the JDBC fetch size parameter in your client application. If you use a JDBC connection and you are encountering client-side out-of-memory errors, you can enable your client to retrieve result sets in smaller batches by setting the JDBC fetch size parameter. For more information, see [Setting the JDBC fetch size parameter](set-the-JDBC-fetch-size-parameter.md). 

## DECLARE CURSOR examples
<a name="declare-example"></a>

The following example declares a cursor named LOLLAPALOOZA to select sales information for the Lollapalooza event, and then fetches rows from the result set using the cursor:

```
-- Begin a transaction

begin;

-- Declare a cursor

declare lollapalooza cursor for
select eventname, starttime, pricepaid/qtysold as costperticket, qtysold
from sales, event
where sales.eventid = event.eventid
and eventname='Lollapalooza';

-- Fetch the first 5 rows in the cursor lollapalooza:

fetch forward 5 from lollapalooza;

  eventname   |      starttime      | costperticket | qtysold
--------------+---------------------+---------------+---------
 Lollapalooza | 2008-05-01 19:00:00 |   92.00000000 |       3
 Lollapalooza | 2008-11-15 15:00:00 |  222.00000000 |       2
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       3
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       4
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       1
(5 rows)

-- Fetch the next row:

fetch next from lollapalooza;

  eventname   |      starttime      | costperticket | qtysold
--------------+---------------------+---------------+---------
 Lollapalooza | 2008-10-06 14:00:00 |  114.00000000 |       2

-- Close the cursor and end the transaction:

close lollapalooza;
commit;
```

The following example loops over a refcursor with all the results from a table:

```
CREATE TABLE tbl_1 (a int, b int);
INSERT INTO tbl_1 values (1, 2),(3, 4);

CREATE OR REPLACE PROCEDURE sp_cursor_loop() AS $$
DECLARE
    target record;
    curs1 cursor for select * from tbl_1;
BEGIN
    OPEN curs1;
    LOOP
        fetch curs1 into target;
        exit when not found;
        RAISE INFO 'a %', target.a;
    END LOOP;
    CLOSE curs1;
END;
$$ LANGUAGE plpgsql;

CALL sp_cursor_loop();
         
SELECT message 
   from svl_stored_proc_messages 
   where querytxt like 'CALL sp_cursor_loop()%';
         
  message
----------
      a 1
      a 3
```

# DELETE
<a name="r_DELETE"></a>

Deletes rows from tables.

**Note**  
The maximum size for a single SQL statement is 16 MB.

## Syntax
<a name="r_DELETE-synopsis"></a>

```
[ WITH [RECURSIVE] common_table_expression [, common_table_expression , ...] ]
DELETE [ FROM ] { table_name | materialized_view_name }
    [ { USING } table_name, ... ]
    [ WHERE condition ]
```

## Parameters
<a name="r_DELETE-parameters"></a>

WITH clause  
Optional clause that specifies one or more *common-table-expressions*. See [WITH clause](r_WITH_clause.md). 

FROM  
The FROM keyword is optional, except when the USING clause is specified. The statements `delete from event;` and `delete event;` are equivalent operations that remove all of the rows from the EVENT table.  
To delete all the rows from a table, [TRUNCATE](r_TRUNCATE.md) the table. TRUNCATE is much more efficient than DELETE and doesn't require a VACUUM and ANALYZE. However, be aware that TRUNCATE commits the transaction in which it is run.

 *table\_name*   
A temporary or persistent table. Only the owner of the table or a user with DELETE privilege on the table may delete rows from the table.  
Consider using the TRUNCATE command for fast unqualified delete operations on large tables; see [TRUNCATE](r_TRUNCATE.md).  
After deleting a large number of rows from a table:  
+ Vacuum the table to reclaim storage space and re-sort rows.
+ Analyze the table to update statistics for the query planner.
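For example, after a large delete from the LISTING table:

```
vacuum listing;
analyze listing;
```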

 *materialized\_view\_name*   
A materialized view. The DELETE statement works on a materialized view used for [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md). Only the owner of the materialized view or a user with DELETE privilege on the materialized view may delete rows from it.  
You can't run DELETE on a materialized view for streaming ingestion that has a row-level security (RLS) policy attached unless you have been granted the IGNORE RLS permission. If the user performing the DELETE has IGNORE RLS granted, the statement runs successfully. For more information, see [RLS policy ownership and management](https://docs.aws.amazon.com/redshift/latest/dg/t_rls_ownership.html).

USING *table\_name*, ...  
The USING keyword is used to introduce a table list when additional tables are referenced in the WHERE clause condition. For example, the following statement deletes all of the rows from the EVENT table that satisfy the join condition over the EVENT and SALES tables. The SALES table must be explicitly named in the FROM list:  

```
delete from event using sales where event.eventid=sales.eventid;
```
If you repeat the target table name in the USING clause, the DELETE operation runs a self-join. You can use a subquery in the WHERE clause instead of the USING syntax as an alternative way to write the same query.
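For example, the following sketch repeats the target table in the USING clause under an alias to run a self-join; it deletes each sale priced below another sale for the same event (the join logic is a sample):

```
delete from sales
using sales s2
where sales.eventid = s2.eventid
and sales.pricepaid < s2.pricepaid;
```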

WHERE *condition*   
Optional clause that limits the deletion of rows to those that match the condition. For example, the condition can be a restriction on a column, a join condition, or a condition based on the result of a query. The query can reference tables other than the target of the DELETE command. For example:  

```
delete from t1
where col1 in(select col2 from t2);
```
If no condition is specified, all of the rows in the table are deleted.

## Usage notes
<a name="r_DELETE-usage"></a>
+ DELETE operations hold exclusive locks when run on Amazon Redshift streaming materialized views connected to any of the following:
  +  An Amazon Kinesis data stream 
  +  An Amazon Managed Streaming for Apache Kafka topic 
  +  A supported external stream, such as a Confluent Cloud Kafka topic 

  For more information, see [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

## Examples
<a name="r_DELETE-examples"></a>

Delete all of the rows from the CATEGORY table:

```
delete from category;
```

Delete rows with CATID values between 0 and 9 from the CATEGORY table:

```
delete from category
where catid between 0 and 9;
```

Delete rows from the LISTING table whose SELLERID values don't exist in the SALES table:

```
delete from listing
where listing.sellerid not in(select sales.sellerid from sales);
```

The following two queries both delete one row from the CATEGORY table, based on a join to the EVENT table and an additional restriction on the CATID column:

```
delete from category
using event
where event.catid=category.catid and category.catid=9;
```

```
delete from category
where catid in
(select category.catid from category, event
where category.catid=event.catid and category.catid=9);
```

The following query deletes all rows from the `mv_cities` materialized view. The materialized view name in this example is a sample:

```
delete from mv_cities;
```

# DESC DATASHARE
<a name="r_DESC_DATASHARE"></a>

Displays a list of the database objects within a datashare that are added to it using ALTER DATASHARE. Amazon Redshift displays the names, databases, schemas, and types of tables, views, and functions. 

Additional information about datashare objects can be found by using system views. For more information, see [SVV\_DATASHARE\_OBJECTS](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_DATASHARE_OBJECTS.html) and [SVV\_DATASHARES](https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_DATASHARES.html).

## Syntax
<a name="r_DESC_DATASHARE-synopsis"></a>

```
DESC DATASHARE datashare_name [ OF [ ACCOUNT account_id ] NAMESPACE namespace_guid ]
```

## Parameters
<a name="r_DESC_DATASHARE-parameters"></a>

 *datashare\_name*   
The name of the datashare. 

NAMESPACE *namespace\_guid*   
A value that specifies the namespace that the datashare uses. When you run DESC DATASHARE as a consumer cluster administrator, specify the NAMESPACE parameter to view inbound datashares.

ACCOUNT *account\_id*  
A value that specifies the account that the datashare belongs to.

## Usage notes
<a name="r_DESC_DATASHARE-usage"></a>

As a consumer account administrator, when you run DESC DATASHARE to see inbound datashares within the AWS account, specify the NAMESPACE option. When you run DESC DATASHARE to see inbound datashares across AWS accounts, specify the ACCOUNT and NAMESPACE options.

## Examples
<a name="r_DESC_DATASHARE-examples"></a>

The following example displays the information for outbound datashares on a producer cluster.

```
DESC DATASHARE salesshare;

producer_account |          producer_namespace           | share_type  | share_name   | object_type |        object_name           |  include_new
-----------------+---------------------------------------+-------------+--------------+-------------+------------------------------+--------------
 123456789012    | 13b8833d-17c6-4f16-8fe4-1a018f5ed00d  | OUTBOUND    |  salesshare  | TABLE       | public.tickit_sales_redshift |
 123456789012    | 13b8833d-17c6-4f16-8fe4-1a018f5ed00d  | OUTBOUND    |  salesshare  | SCHEMA      | public                       |   t
```

The following example displays the information for inbound datashares on a consumer cluster.

```
DESC DATASHARE salesshare of ACCOUNT '123456789012' NAMESPACE '13b8833d-17c6-4f16-8fe4-1a018f5ed00d';

 producer_account |          producer_namespace          | share_type | share_name | object_type |         object_name          |  include_new
------------------+--------------------------------------+------------+------------+-------------+------------------------------+--------------
 123456789012     | 13b8833d-17c6-4f16-8fe4-1a018f5ed00d | INBOUND    | salesshare | table       | public.tickit_sales_redshift |
 123456789012     | 13b8833d-17c6-4f16-8fe4-1a018f5ed00d | INBOUND    | salesshare | schema      | public                       |
(2 rows)
```

# DESC IDENTITY PROVIDER
<a name="r_DESC_IDENTITY_PROVIDER"></a>

Displays information about an identity provider. Only a superuser can describe an identity provider.

## Syntax
<a name="r_DESC_IDENTITY_PROVIDER-synopsis"></a>

```
DESC IDENTITY PROVIDER identity_provider_name
```

## Parameters
<a name="r_DESC_IDENTITY_PROVIDER-parameters"></a>

 *identity\_provider\_name*   
The name of the identity provider.

## Example
<a name="r_DESC_IDENTITY_PROVIDER-examples"></a>

The following example displays information about the identity provider.

```
DESC IDENTITY PROVIDER azure_idp;
```

Sample output.

```
  uid   |   name    | type  |              instanceid              | namespc |                                                                                                                                                 params                                                                                                                                                  | enabled
--------+-----------+-------+--------------------------------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------
 126692 | azure_idp | azure | e40d4bb2-7670-44ae-bfb8-5db013221d73 | aad     | {"issuer":"https://login.microsoftonline.com/e40d4bb2-7670-44ae-bfb8-5db013221d73/v2.0", "client_id":"871c010f-5e61-4fb1-83ac-98610a7e9110", "client_secret":'', "audience":["https://analysis.windows.net/powerbi/connector/AmazonRedshift", "https://analysis.windows.net/powerbi/connector/AWSRDS"]} | t
(1 row)
```

# DETACH MASKING POLICY
<a name="r_DETACH_MASKING_POLICY"></a>

Detaches an already attached dynamic data masking policy from a column. For more information on dynamic data masking, see [Dynamic data masking](t_ddm.md).

Superusers and users or roles that have the sys:secadmin role can detach a masking policy.

## Syntax
<a name="r_DETACH_MASKING_POLICY-synopsis"></a>

```
DETACH MASKING POLICY
{
  policy_name ON table_name
  | database_name.policy_name ON database_name.schema_name.table_name
}
( output_column_names )
FROM { user_name | ROLE role_name | PUBLIC };
```

## Parameters
<a name="r_DETACH_MASKING_POLICY-parameters"></a>

 *policy\_name*   
The name of the masking policy to detach.

database\_name  
The name of the database where the policy and the relation are created. The policy and the relation must be in the same database. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

schema\_name  
The name of the schema that the relation belongs to.

 *table\_name*   
The name of the table to detach the masking policy from.

*output\_column\_names*   
The names of the columns to which the masking policy was attached.

*user\_name*   
The name of the user to whom the masking policy was attached.  
You can only set one of user\_name, role\_name, and PUBLIC in a single DETACH MASKING POLICY statement.

*role\_name*   
The name of the role to which the masking policy was attached.  
You can only set one of user\_name, role\_name, and PUBLIC in a single DETACH MASKING POLICY statement.

*PUBLIC*   
Indicates that the policy was attached to all users of the table.  
You can only set one of user\_name, role\_name, and PUBLIC in a single DETACH MASKING POLICY statement.

For the usage of DETACH MASKING POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).
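## Example
<a name="r_DETACH_MASKING_POLICY-example"></a>

The following illustrative example detaches a masking policy from a column for a role. The policy, table, column, and role names (mask\_credit\_card, credit\_cards, credit\_card\_number, analyst) are placeholders, not objects defined elsewhere in this guide.

```
DETACH MASKING POLICY mask_credit_card
ON credit_cards(credit_card_number)
FROM ROLE analyst;
```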

# DETACH RLS POLICY
<a name="r_DETACH_RLS_POLICY"></a>

Detach a row-level security policy on a table from one or more users or roles.

Superusers and users or roles that have the `sys:secadmin` role can detach a policy.

## Syntax
<a name="r_DETACH_RLS_POLICY-synopsis"></a>

```
DETACH RLS POLICY
{
  policy_name ON [TABLE] table_name [, ...]
  | database_name.policy_name ON [TABLE] database_name.schema_name.table_name [, ...]
}
FROM { user_name | ROLE role_name | PUBLIC } [, ...];
```

## Parameters
<a name="r_DETACH_RLS_POLICY-parameters"></a>

 *policy\_name*   
The name of the policy.

database\_name  
The name of the database where the policy and the relation are created. The policy and the relation must be in the same database. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

schema\_name  
The name of the schema that the relation belongs to.

table\_name  
The relation that the row-level security policy is attached to.

FROM { *user\_name* | ROLE *role\_name* | PUBLIC } [, ...]  
Specifies whether the policy is detached from one or more specified users or roles. 

For the usage of DETACH RLS POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).

## Usage notes
<a name="r_DETACH_RLS_POLICY-usage"></a>

When working with the DETACH RLS POLICY statement, observe the following:
+ You can detach a policy from a relation, user, role, or public.

## Examples
<a name="r_DETACH_RLS_POLICY-examples"></a>

The following example detaches a policy on a table from a role.

```
DETACH RLS POLICY policy_concerts ON tickit_category_redshift FROM ROLE analyst, ROLE dbadmin;
```

# DROP DATABASE
<a name="r_DROP_DATABASE"></a>

Drops a database. 

You can't run DROP DATABASE within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

## Syntax
<a name="r_DROP_DATABASE-synopsis"></a>

```
DROP DATABASE database_name [ FORCE ]
```

## Parameters
<a name="r_DROP_DATABASE-parameters"></a>

 *database\_name*   
Name of the database to be dropped. You can't drop the dev, padb\_harvest, template0, template1, or sys:internal databases, and you can't drop the current database.  
To drop an external database, drop the external schema. For more information, see [DROP SCHEMA](r_DROP_SCHEMA.md).

 FORCE   
When you specify FORCE, DROP DATABASE attempts to terminate active connections prior to dropping the database. If all active connections successfully terminate within a timeout, the drop proceeds. If not all connections terminate, the command throws an error.

## DROP DATABASE usage notes
<a name="r_DROP_DATABASE_usage"></a>

When using the DROP DATABASE statement, consider the following:
+ In general, we recommend that you don't drop a database that contains an AWS Data Exchange datashare using the DROP DATABASE statement. If you do, the AWS accounts that have access to the datashare lose access. Performing this type of alteration can breach data product terms in AWS Data Exchange.

  The following example shows an error when a database that contains an AWS Data Exchange datashare is dropped.

  ```
  DROP DATABASE test_db;
  ERROR:   Drop of database test_db that contains ADX-managed datashare(s) requires session variable datashare_break_glass_session_var to be set to value 'ce8d280c10ad41'
  ```

  To allow dropping the database, set the following variable and run the DROP DATABASE statement again.

  ```
  SET datashare_break_glass_session_var to 'ce8d280c10ad41';
  ```

  ```
  DROP DATABASE test_db;
  ```

  In this case, Amazon Redshift generates a random one-time value to set the session variable to allow DROP DATABASE for a database that contains an AWS Data Exchange datashare.

## Examples
<a name="r_DROP_DATABASE-examples"></a>

The following example drops a database named TICKIT\_TEST: 

```
drop database tickit_test;
```
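
To terminate active connections before the drop, you can add the FORCE keyword, as in the following variation of the preceding example:

```
drop database tickit_test force;
```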

# DROP DATASHARE
<a name="r_DROP_DATASHARE"></a>

Drops a datashare. This command isn't reversible.

Only a superuser or the datashare owner can drop a datashare.

## Required privileges
<a name="r_DROP_DATASHARE-privileges"></a>

Following are required privileges for DROP DATASHARE:
+ Superuser
+ Users with the DROP DATASHARE privilege
+ Datashare owner

## Syntax
<a name="r_DROP_DATASHARE-synopsis"></a>

```
DROP DATASHARE datashare_name;
```

## Parameters
<a name="r_DROP_DATASHARE-parameters"></a>

 *datashare\$1name*   
The name of the datashare to be dropped.

## DROP DATASHARE usage notes
<a name="r_DROP_DATASHARE_usage"></a>

When using the DROP DATASHARE statement, consider the following:
+ In general, we recommend that you don't drop an AWS Data Exchange datashare using the DROP DATASHARE statement. If you do, the AWS accounts that have access to the datashare lose access. Performing this type of alteration can breach data product terms in AWS Data Exchange.

  The following example shows an error when an AWS Data Exchange datashare is dropped.

  ```
  DROP DATASHARE salesshare;
  ERROR:  Drop of ADX-managed datashare salesshare requires session variable datashare_break_glass_session_var to be set to value '620c871f890c49'
  ```

  To allow dropping an AWS Data Exchange datashare, set the following variable and run the DROP DATASHARE statement again.

  ```
  SET datashare_break_glass_session_var to '620c871f890c49';
  ```

  ```
  DROP DATASHARE salesshare;
  ```

  In this case, Amazon Redshift generates a random one-time value to set the session variable to allow DROP DATASHARE for an AWS Data Exchange datashare.

## Examples
<a name="r_DROP_DATASHARE-examples"></a>

The following example drops a datashare named `salesshare`.

```
DROP DATASHARE salesshare;
```

# DROP EXTERNAL VIEW
<a name="r_DROP_EXTERNAL_VIEW"></a>

Drops an external view from the database. Dropping an external view removes it from all SQL engines the view is associated with, such as Amazon Athena and Amazon EMR Spark. This command can't be reversed. For more information about Data Catalog views, see [AWS Glue Data Catalog views](https://docs.aws.amazon.com/redshift/latest/dg/data-catalog-views-overview.html).

## Syntax
<a name="r_DROP_EXTERNAL_VIEW-synopsis"></a>

```
DROP EXTERNAL VIEW
{ catalog_name.schema_name.view_name | awsdatacatalog.dbname.view_name | external_schema_name.view_name }
[ IF EXISTS ]
```

## Parameters
<a name="r_DROP_EXTERNAL_VIEW-parameters"></a>

 *schema\_name.view\_name*   
The schema that’s attached to your AWS Glue database, followed by the name of the view.

IF EXISTS  
Drops the view only if it exists.

catalog\_name.schema\_name.view\_name | awsdatacatalog.dbname.view\_name | external\_schema\_name.view\_name  
The notation of the schema to use when dropping the view. You can specify the AWS Glue Data Catalog, an AWS Glue database that you created, or an external schema that you created. See [CREATE DATABASE](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_DATABASE.html) and [CREATE EXTERNAL SCHEMA](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_EXTERNAL_SCHEMA.html) for more information.


## Examples
<a name="r_DROP_EXTERNAL_VIEW-examples"></a>

The following example drops a Data Catalog view named sample\_schema.glue\_data\_catalog\_view.

```
DROP EXTERNAL VIEW sample_schema.glue_data_catalog_view IF EXISTS
```

# DROP FUNCTION
<a name="r_DROP_FUNCTION"></a>

Removes a user-defined function (UDF) from the database. The function's signature, or list of argument data types, must be specified because multiple functions can exist with the same name but different signatures. You can't drop an Amazon Redshift built-in function.

This command isn't reversible.

## Required privileges
<a name="r_DROP_FUNCTION-privileges"></a>

Following are required privileges for DROP FUNCTION:
+ Superuser
+ Users with the DROP FUNCTION privilege
+ Function owner

## Syntax
<a name="r_DROP_FUNCTION-synopsis"></a>

```
DROP FUNCTION name
( [arg_name] arg_type   [, ...] )
[ CASCADE | RESTRICT ]
```

## Parameters
<a name="r_DROP_FUNCTION-parameters"></a>

 *name*   
The name of the function to be removed.

 *arg\_name*   
The name of an input argument. DROP FUNCTION ignores argument names, because only the argument data types are needed to determine the function's identity.

 *arg\_type*   
The data type of the input argument. You can supply a comma-separated list with a maximum of 32 data types.

 CASCADE   
Keyword specifying to automatically drop objects that depend on the function, such as views.   
To create a view that isn't dependent on a function, include the WITH NO SCHEMA BINDING clause in the view definition. For more information, see [CREATE VIEW](r_CREATE_VIEW.md).

 RESTRICT   
Keyword specifying that if any objects depend on the function, do not drop the function and return a message. This action is the default.

## Examples
<a name="r_DROP_FUNCTION-examples"></a>

The following example drops the function named `f_sqrt`:

```
drop function f_sqrt(int);
```

To remove a function that has dependencies, use the CASCADE option, as shown in the following example:

```
drop function f_sqrt(int) cascade;
```

# DROP GROUP
<a name="r_DROP_GROUP"></a>

Deletes a user group. This command isn't reversible. This command doesn't delete the individual users in a group.

See DROP USER to delete an individual user.

## Syntax
<a name="r_DROP_GROUP-synopsis"></a>

```
DROP GROUP name
```

## Parameter
<a name="r_DROP_GROUP-parameter"></a>

 *name*   
Name of the user group to delete.

## Example
<a name="r_DROP_GROUP-example"></a>

The following example deletes the `guests` user group:

```
DROP GROUP guests;
```

You can't drop a group if the group has any privileges on an object. If you attempt to drop such a group, you will receive the following error.

```
ERROR: group "guests" can't be dropped because the group has a privilege on some object
```

If the group has privileges for an object, you must revoke the privileges before dropping the group. To find the objects that the `guests` group has privileges for, use the following example. For more information about the metadata view used in the example, see [SVV\_RELATION\_PRIVILEGES](https://docs.aws.amazon.com//redshift/latest/dg/r_SVV_RELATION_PRIVILEGES.html).

```
SELECT DISTINCT namespace_name, relation_name, identity_name, identity_type 
FROM svv_relation_privileges
WHERE identity_type='group' AND identity_name='guests';

+----------------+---------------+---------------+---------------+
| namespace_name | relation_name | identity_name | identity_type |
+----------------+---------------+---------------+---------------+
| public         | table1        | guests        | group         |
+----------------+---------------+---------------+---------------+
| public         | table2        | guests        | group         |
+----------------+---------------+---------------+---------------+
```

The following example revokes all privileges on all tables in the `public` schema from the `guests` user group, and then drops the group.

```
REVOKE ALL ON ALL TABLES IN SCHEMA public FROM GROUP guests;
DROP GROUP guests;
```

# DROP IDENTITY PROVIDER
<a name="r_DROP_IDENTITY_PROVIDER"></a>

Deletes an identity provider. This command isn't reversible. Only a superuser can drop an identity provider.

## Syntax
<a name="r_DROP_IDENTITY_PROVIDER-synopsis"></a>

```
DROP IDENTITY PROVIDER identity_provider_name [ CASCADE ]
```

## Parameters
<a name="r_DROP_IDENTITY_PROVIDER-parameter"></a>

 *identity\_provider\_name*   
Name of the identity provider to delete.

 CASCADE   
Deletes the users and roles attached to the identity provider when the provider is dropped.

## Example
<a name="r_DROP_IDENTITY_PROVIDER-example"></a>

The following example deletes the *oauth\_provider* identity provider.

```
DROP IDENTITY PROVIDER oauth_provider;
```

If you drop the identity provider, some users may not be able to log in or use client tools configured to use the identity provider.

# DROP LIBRARY
<a name="r_DROP_LIBRARY"></a>

Removes a custom Python library from the database. Only the library owner or a superuser can drop a library. 

DROP LIBRARY can't be run inside a transaction block (BEGIN … END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

This command isn't reversible. The DROP LIBRARY command commits immediately. If a UDF that depends on the library is running concurrently, the UDF might fail, even if the UDF is running within a transaction.

For more information, see [CREATE LIBRARY](r_CREATE_LIBRARY.md). 

## Required privileges
<a name="r_DROP_LIBRARY-privileges"></a>

Following are required privileges for DROP LIBRARY:
+ Superuser
+ Users with the DROP LIBRARY privilege
+ Library owner

## Syntax
<a name="r_DROP_LIBRARY-synopsis"></a>

```
DROP LIBRARY library_name
```

## Parameters
<a name="r_DROP_LIBRARY-parameters"></a>

 *library\_name*   
The name of the library.
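
## Example
<a name="r_DROP_LIBRARY-example"></a>

The following illustrative example drops a library. The library name f\_urlparse is a placeholder for a library that was previously installed with CREATE LIBRARY.

```
DROP LIBRARY f_urlparse;
```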

# DROP MASKING POLICY
<a name="r_DROP_MASKING_POLICY"></a>

Drops a dynamic data masking policy from all databases. You can't drop a masking policy that's still attached to one or more tables. For more information on dynamic data masking, see [Dynamic data masking](t_ddm.md).

Superusers and users or roles that have the sys:secadmin role can drop a masking policy.

## Syntax
<a name="r_DROP_MASKING_POLICY-synopsis"></a>

```
DROP MASKING POLICY { policy_name | database_name.policy_name };
```

## Parameters
<a name="r_DROP_MASKING_POLICY-parameters"></a>

 *policy\_name*   
The name of the masking policy to drop.

database\_name  
The name of the database from which the policy is dropped. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

For the usage of DROP MASKING POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).
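
## Example
<a name="r_DROP_MASKING_POLICY-example"></a>

The following illustrative example drops a masking policy. The policy name mask\_credit\_card is a placeholder; the policy must already be detached from all tables before you run this statement.

```
DROP MASKING POLICY mask_credit_card;
```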

# DROP MODEL
<a name="r_DROP_MODEL"></a>

Removes a model from the database. Only the model owner or a superuser can drop a model. 

DROP MODEL also deletes the associated prediction function that is derived from this model, all Amazon Redshift artifacts related to the model, and all Amazon S3 data related to the model. If the model is still being trained in Amazon SageMaker AI, DROP MODEL cancels those operations.

This command isn't reversible. The DROP MODEL command commits immediately.

## Required permissions
<a name="r_DROP_MODEL-privileges"></a>

Following are required permissions for DROP MODEL:
+ Superuser
+ Users with the DROP MODEL permission
+ Model owner
+ Schema owner

## Syntax
<a name="r_DROP_MODEL-synopsis"></a>

```
DROP MODEL [ IF EXISTS ] model_name
```

## Parameters
<a name="r_DROP_MODEL-parameters"></a>

 *IF EXISTS*   
A clause that indicates that if the specified model doesn't exist, the command should make no changes and return a message that the model doesn't exist, rather than terminating with an error.

 *model\_name*   
The name of the model. The model name in a schema must be unique.

## Examples
<a name="r_DROP_MODEL-examples"></a>

The following example drops the model demo\_ml.customer\_churn.

```
DROP MODEL demo_ml.customer_churn
```

# DROP MATERIALIZED VIEW
<a name="materialized-view-drop-sql-command"></a>

Removes a materialized view.

For more information about materialized views, see [Materialized views in Amazon Redshift](materialized-view-overview.md).

## Syntax
<a name="mv_DROP_MATERIALIZED_VIEW-synopsis"></a>

```
DROP MATERIALIZED VIEW [ IF EXISTS ] mv_name [, ... ] [ CASCADE | RESTRICT ]
```

## Parameters
<a name="mv_DROP_MATERIALIZED_VIEW-parameters"></a>

IF EXISTS  
A clause that specifies to check whether the named materialized view exists. If the materialized view doesn't exist, the `DROP MATERIALIZED VIEW` command returns a message rather than terminating with an error. This clause is useful when scripting, to keep the script from failing if you drop a nonexistent materialized view.

*mv\_name*  
The name of the materialized view to be dropped.

CASCADE  
A clause that indicates to automatically drop objects that the materialized view depends on, such as other views.

RESTRICT  
A clause that indicates to not drop the materialized view if any objects depend on it. This is the default.

## Usage Notes
<a name="mv_DROP_MATERIALIZED_VIEW-usage"></a>

Only the owner of a materialized view can use `DROP MATERIALIZED VIEW` on that view. Superusers and users who have specifically been granted DROP privileges are exceptions.

If you run `DROP MATERIALIZED VIEW` and the name matches a regular view rather than a materialized view, the command returns an error that instructs you to use DROP VIEW instead. The error occurs even when you use `DROP MATERIALIZED VIEW IF EXISTS`.

## Example
<a name="mv_DROP_MATERIALIZED_VIEW-examples"></a>

The following example drops the `tickets_mv` materialized view.

```
DROP MATERIALIZED VIEW tickets_mv;
```
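
To keep a script from failing when the materialized view might not exist, you can combine the same statement with the IF EXISTS clause:

```
DROP MATERIALIZED VIEW IF EXISTS tickets_mv;
```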

# DROP PROCEDURE
<a name="r_DROP_PROCEDURE"></a>

Drops a procedure. To drop a procedure, both the procedure name and input argument data types (the signature) are required. Optionally, you can include the full argument data types, including OUT arguments. To find the signature for a procedure, use the [SHOW PROCEDURE](r_SHOW_PROCEDURE.md) command. For more information about procedure signatures, see [PG\_PROC\_INFO](r_PG_PROC_INFO.md).

## Required privileges
<a name="r_DROP_PROCEDURE-privileges"></a>

Following are required privileges for DROP PROCEDURE:
+ Superuser
+ Users with the DROP PROCEDURE privilege
+ Procedure owner

## Syntax
<a name="r_DROP_PROCEDURE-synopsis"></a>

```
DROP PROCEDURE sp_name ( [ [ argname ] [ argmode ] argtype [, ...] ] )
```

## Parameters
<a name="r_DROP_PROCEDURE-parameters"></a>

 *sp\_name*   
The name of the procedure to be removed. 

 *argname*   
The name of an input argument. DROP PROCEDURE ignores argument names, because only the argument data types are needed to determine the procedure's identity. 

 *argmode*   
The mode of an argument, which can be IN, OUT, or INOUT. OUT arguments are optional because they aren't used to identify a stored procedure. 

 *argtype*   
The data type of the input argument. For a list of the supported data types, see [Data types](c_Supported_data_types.md). 

## Examples
<a name="r_DROP_PROCEDURE-examples"></a>

The following example drops a stored procedure named `quarterly_revenue`.

```
DROP PROCEDURE quarterly_revenue(volume INOUT bigint, at_price IN numeric,result OUT int);
```

# DROP RLS POLICY
<a name="r_DROP_RLS_POLICY"></a>

Drops a row-level security policy for all tables in all databases.

Superusers and users or roles that have the sys:secadmin role can drop a policy.

## Syntax
<a name="r_DROP_RLS_POLICY-synopsis"></a>

```
DROP RLS POLICY [ IF EXISTS ] 
{ policy_name | database_name.policy_name }
[ CASCADE | RESTRICT ]
```

## Parameters
<a name="r_DROP_RLS_POLICY-parameters"></a>

 *IF EXISTS*   
A clause that indicates that if the specified policy doesn't exist, the command should make no changes and return a message that the policy doesn't exist, rather than terminating with an error.

 *policy\_name*   
The name of the policy.

database\_name  
The name of the database from which the policy is dropped. The database can be the connected database or a database that supports Amazon Redshift federated permissions.

 *CASCADE*   
A clause that indicates to automatically detach the policy from all attached tables before dropping the policy.

 *RESTRICT*   
A clause that indicates not to drop the policy when it is attached to some tables. This is the default.

For the usage of DROP RLS POLICY on Amazon Redshift Federated Permissions Catalog, see [ Managing access control with Amazon Redshift federated permissions](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html).

## Examples
<a name="r_DROP_RLS_POLICY-examples"></a>

The following example drops the row-level security policy.

```
DROP RLS POLICY policy_concerts;
```
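
If the policy might still be attached to one or more tables, the CASCADE option detaches it from all of them before dropping it, as in the following variation of the preceding example:

```
DROP RLS POLICY policy_concerts CASCADE;
```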

# DROP ROLE
<a name="r_DROP_ROLE"></a>

Removes a role from a database. Only the user who created the role, a user who has been granted the role with the WITH ADMIN OPTION, or a superuser can drop a role.

You can't drop a role that is granted to a user or another role that is dependent on this role.

## Required privileges
<a name="r_DROP_ROLE-privileges"></a>

Following are the required privileges for DROP ROLE:
+ Superuser
+ Role owner who is either the user that created the role or a user that has been granted the role with the WITH ADMIN OPTION privilege.

## Syntax
<a name="r_DROP_ROLE-synopsis"></a>

```
DROP ROLE role_name [ FORCE | RESTRICT ] 
```

## Parameters
<a name="r_DROP_ROLE-parameters"></a>

*role\_name*  
The name of the role.

FORCE | RESTRICT  
The default setting is RESTRICT. Amazon Redshift throws an error when you try to drop a role that has been granted to a user or to another role. Use FORCE to remove all role assignments, if any exist. 

## Examples
<a name="r_DROP_ROLE-examples"></a>

The following example drops the role `sample_role`.

```
DROP ROLE sample_role FORCE;
```

The following example attempts to drop the role sample\_role1 that has been granted to a user with the default RESTRICT option.

```
CREATE ROLE sample_role1;
GRANT ROLE sample_role1 TO user1;
DROP ROLE sample_role1;
ERROR:  cannot drop this role since it has been granted on a user
```

To successfully drop sample\_role1, which has been granted to a user, use the FORCE option.

```
DROP ROLE sample_role1 FORCE;
```

The following example attempts to drop the role sample\_role2 that has another role dependent on it with the default RESTRICT option.

```
CREATE ROLE sample_role1;
CREATE ROLE sample_role2;
GRANT ROLE sample_role1 TO sample_role2;
DROP ROLE sample_role2;
ERROR:  cannot drop this role since it depends on another role
```

To successfully drop sample\_role2, which has another role dependent on it, use the FORCE option.

```
DROP ROLE sample_role2 FORCE;
```

# DROP SCHEMA
<a name="r_DROP_SCHEMA"></a>

Deletes a schema. For an external schema, you can also drop the external database associated with the schema. This command isn't reversible.

## Required privileges
<a name="r_DROP_SCHEMA-privileges"></a>

Following are required privileges for DROP SCHEMA:
+ Superuser
+ Schema owner
+ Users with the DROP SCHEMA privilege

## Syntax
<a name="r_DROP_SCHEMA-synopsis"></a>

```
DROP SCHEMA [ IF EXISTS ] name [, ...]
[ DROP EXTERNAL DATABASE ]
[ CASCADE | RESTRICT ]
```

## Parameters
<a name="r_DROP_SCHEMA-parameters"></a>

IF EXISTS  
Clause that indicates that if the specified schema doesn’t exist, the command should make no changes and return a message that the schema doesn't exist, rather than terminating with an error.  
This clause is useful when scripting, so the script doesn’t fail if DROP SCHEMA runs against a nonexistent schema.

 *name*   
Names of the schemas to drop. You can specify multiple schema names separated by commas.

 DROP EXTERNAL DATABASE   
Clause that indicates that if an external schema is dropped, drop the external database associated with the external schema, if one exists. If no external database exists, the command returns a message stating that no external database exists. If multiple external schemas are dropped, all databases associated with the specified schemas are dropped.   
If an external database contains dependent objects such as tables, include the CASCADE option to drop the dependent objects as well.   
When you drop an external database, the database is also dropped for any other external schemas associated with the database. Tables defined in other external schemas using the database are also dropped.   
DROP EXTERNAL DATABASE doesn't support external databases stored in a Hive metastore. 

CASCADE  
Keyword that indicates to automatically drop all objects in the schema. If DROP EXTERNAL DATABASE is specified, all objects in the external database are also dropped.

RESTRICT  
Keyword that indicates not to drop a schema or external database if it contains any objects. This action is the default.

## Example
<a name="r_DROP_SCHEMA-example"></a>

The following example deletes a schema named S\_SALES. This example uses RESTRICT as a safety mechanism so that the schema isn't deleted if it contains any objects. In this case, you need to delete the schema objects before deleting the schema.

```
drop schema s_sales restrict;
```

The following example deletes a schema named S\_SALES and all objects that depend on that schema.

```
drop schema s_sales cascade;
```

The following example either drops the S\_SALES schema if it exists, or does nothing and returns a message if it doesn't.

```
drop schema if exists s_sales;
```

The following example deletes an external schema named S\_SPECTRUM and the external database associated with it. This example uses RESTRICT so that the schema and database aren't deleted if they contain any objects. In this case, you need to delete the dependent objects before deleting the schema and the database.

```
drop schema s_spectrum drop external database restrict;
```

The following example deletes multiple schemas and the external databases associated with them, along with any dependent objects. 

```
drop schema s_sales, s_profit, s_revenue drop external database cascade;
```

# DROP TABLE
<a name="r_DROP_TABLE"></a>

Removes a table from a database. 

If you are trying to empty a table of rows without removing the table, use the DELETE or TRUNCATE command. 

DROP TABLE removes constraints that exist on the target table. Multiple tables can be removed with a single DROP TABLE command. 

DROP TABLE with an external table can't be run inside a transaction (BEGIN … END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md).

To find an example where the DROP privilege is granted to a group, see GRANT [Examples](r_GRANT-examples.md).

## Required privileges
<a name="r_DROP_TABLE-privileges"></a>

Following are required privileges for DROP TABLE:
+ Superuser
+ Users with the DROP TABLE privilege
+ Table owner with the USAGE privilege on the schema

## Syntax
<a name="r_DROP_TABLE-synopsis"></a>

```
DROP TABLE [ IF EXISTS ] name [, ...] [ CASCADE | RESTRICT ]
```

## Parameters
<a name="r_DROP_TABLE-parameters"></a>

IF EXISTS  
Clause that indicates that if the specified table doesn’t exist, the command should make no changes and return a message that the table doesn't exist, rather than terminating with an error.  
This clause is useful when scripting, so the script doesn’t fail if DROP TABLE runs against a nonexistent table.

 *name*   
Name of the table to drop. 

CASCADE  
Clause that indicates to automatically drop objects that depend on the table, such as views.  
To create a view that isn't dependent on other database objects, such as views and tables, include the WITH NO SCHEMA BINDING clause in the view definition. For more information, see [CREATE VIEW](r_CREATE_VIEW.md).

RESTRICT   
Clause that indicates not to drop the table if any objects depend on it. This action is the default.

## Examples
<a name="r_DROP_TABLE-examples"></a>

 **Dropping a table with no dependencies** 

The following example creates and drops a table called FEEDBACK that has no dependencies: 

```
create table feedback(a int);

drop table feedback;
```

 If a table contains columns that are referenced by views or other tables, Amazon Redshift displays a message such as the following. 

```
Invalid operation: cannot drop table feedback because other objects depend on it
```

 **Dropping two tables simultaneously** 

The following command set creates a FEEDBACK table and a BUYERS table and then drops both tables with a single command: 

```
create table feedback(a int);

create table buyers(a int);

drop table feedback, buyers;
```

 **Dropping a table with a dependency** 

The following steps show how to drop a table called FEEDBACK using the CASCADE switch. 

First, create a simple table called FEEDBACK using the CREATE TABLE command: 

```
create table feedback(a int);
```

 Next, use the CREATE VIEW command to create a view called FEEDBACK\_VIEW that relies on the table FEEDBACK: 

```
create view feedback_view as select * from feedback;
```

 The following example drops the table FEEDBACK and also drops the view FEEDBACK\_VIEW, because FEEDBACK\_VIEW is dependent on the table FEEDBACK: 

```
drop table feedback cascade;
```

 **Viewing the dependencies for a table** 

To return the dependencies for your table, use the following example. Replace *my\_schema* and *my\_table* with your own schema and table. 

```
SELECT dependent_ns.nspname as dependent_schema
, dependent_view.relname as dependent_view 
, source_ns.nspname as source_schema
, source_table.relname as source_table
, pg_attribute.attname as column_name
FROM pg_depend 
JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid 
JOIN pg_class as dependent_view ON pg_rewrite.ev_class = dependent_view.oid 
JOIN pg_class as source_table ON pg_depend.refobjid = source_table.oid 
JOIN pg_attribute ON pg_depend.refobjid = pg_attribute.attrelid 
    AND pg_depend.refobjsubid = pg_attribute.attnum 
JOIN pg_namespace dependent_ns ON dependent_ns.oid = dependent_view.relnamespace
JOIN pg_namespace source_ns ON source_ns.oid = source_table.relnamespace
WHERE 
source_ns.nspname = 'my_schema'
AND source_table.relname = 'my_table'
AND pg_attribute.attnum > 0 
ORDER BY 1,2
LIMIT 10;
```

To drop *my\_table* and its dependencies, use the following example. This example also returns all dependencies for the table that has been dropped.

```
DROP TABLE my_table CASCADE;
         
SELECT dependent_ns.nspname as dependent_schema
, dependent_view.relname as dependent_view 
, source_ns.nspname as source_schema
, source_table.relname as source_table
, pg_attribute.attname as column_name
FROM pg_depend 
JOIN pg_rewrite ON pg_depend.objid = pg_rewrite.oid 
JOIN pg_class as dependent_view ON pg_rewrite.ev_class = dependent_view.oid 
JOIN pg_class as source_table ON pg_depend.refobjid = source_table.oid 
JOIN pg_attribute ON pg_depend.refobjid = pg_attribute.attrelid 
    AND pg_depend.refobjsubid = pg_attribute.attnum 
JOIN pg_namespace dependent_ns ON dependent_ns.oid = dependent_view.relnamespace
JOIN pg_namespace source_ns ON source_ns.oid = source_table.relnamespace
WHERE 
source_ns.nspname = 'my_schema'
AND source_table.relname = 'my_table'
AND pg_attribute.attnum > 0 
ORDER BY 1,2
LIMIT 10;

+------------------+----------------+---------------+--------------+-------------+
| dependent_schema | dependent_view | source_schema | source_table | column_name |
+------------------+----------------+---------------+--------------+-------------+
```

 **Dropping a table Using IF EXISTS** 

The following example either drops the FEEDBACK table if it exists, or does nothing and returns a message if it doesn't: 

```
drop table if exists feedback;
```
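The value of IF EXISTS is that the statement is idempotent: running it against a missing table is a no-op rather than an error, which keeps cleanup scripts from failing midway. The behavior can be illustrated with SQLite through Python's `sqlite3` module (a stand-in for a Redshift session, not Redshift itself):

```python
import sqlite3

# DROP TABLE IF EXISTS is idempotent: the second drop is a no-op.
con = sqlite3.connect(":memory:")
con.execute("create table feedback(a int)")
con.execute("drop table if exists feedback")  # drops the table
con.execute("drop table if exists feedback")  # table is gone; no error

# Without IF EXISTS, dropping a missing table raises an error.
try:
    con.execute("drop table feedback")
except sqlite3.OperationalError as err:
    print("error without IF EXISTS:", err)
```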

# DROP TEMPLATE
<a name="r_DROP_TEMPLATE"></a>

Drops a template from a database.

## Required privileges
<a name="r_DROP_TEMPLATE-privileges"></a>

To drop a template, you must have one of the following:
+ Superuser privileges
+ DROP TEMPLATE privilege and USAGE privilege on the schema containing the template

## Syntax
<a name="r_DROP_TEMPLATE-synopsis"></a>

```
DROP TEMPLATE [database_name.][schema_name.]template_name;
```

## Parameters
<a name="r_DROP_TEMPLATE-parameters"></a>

*database\_name*  
(Optional) The name of the database in which the template is created. If not specified, the current database is used. 

*schema\_name*  
(Optional) The name of the schema in which the template is created. If not specified, the template is searched for in the current search path. 

*template\_name*  
The name of the template to remove. In the following example, the database name is `demo_database`, the schema name is `demo_schema`, and the template name is `test`.  

```
DROP TEMPLATE demo_database.demo_schema.test;
```

## Examples
<a name="r_DROP_TEMPLATE-examples"></a>

The following example drops the template `test_template` from the current schema:

```
DROP TEMPLATE test_template;
```

The following example drops the template `test_template` from the schema `test_schema`:

```
DROP TEMPLATE test_schema.test_template;
```

# DROP USER
<a name="r_DROP_USER"></a>

Drops a user from a database. Multiple users can be dropped with a single DROP USER command. You must be a database superuser or have the DROP USER permission to run this command.

## Syntax
<a name="r_DROP_USER-synopsis"></a>

```
DROP USER [ IF EXISTS ] name [, ... ]
```

## Parameters
<a name="r_DROP_USER-parameters"></a>

IF EXISTS  
Clause that indicates that if the specified user doesn’t exist, the command should make no changes and return a message that the user doesn't exist, rather than terminating with an error.  
This clause is useful when scripting, so the script doesn’t fail if DROP USER runs against a nonexistent user.

 *name*   
Name of the user to remove. You can specify multiple users, with a comma separating each user name from the next.

## Usage notes
<a name="r_DROP_USER-notes"></a>

You can't drop the user named `rdsdb` or the administrator user of the database, which is typically named `awsuser` or `admin`.

You can't drop a user if the user owns any database object, such as a schema, database, table, or view, or if the user has any privileges on a database, table, column, or group. If you attempt to drop such a user, you receive one of the following errors.

```
ERROR: user "username" can't be dropped because the user owns some object [SQL State=55006]

ERROR: user "username" can't be dropped because the user has a privilege on some object [SQL State=55006]
```

For detailed instructions on how to find the objects owned by a database user, see [How do I resolve the "user cannot be dropped" error in Amazon Redshift?](https://repost.aws/knowledge-center/redshift-user-cannot-be-dropped) in the *AWS Knowledge Center*.

**Note**  
Amazon Redshift checks only the current database before dropping a user. DROP USER doesn't return an error if the user owns database objects or has any privileges on objects in another database. If you drop a user that owns objects in another database, the owner for those objects is changed to 'unknown'. 

If a user owns an object, first drop the object or change its ownership to another user before dropping the original user. If the user has privileges for an object, first revoke the privileges before dropping the user. The following example shows dropping an object, changing ownership, and revoking privileges before dropping the user.

```
drop database dwdatabase;
alter schema dw owner to dwadmin;
revoke all on table dwtable from dwuser;
drop user dwuser;
```

## Examples
<a name="r_DROP_USER-examples"></a>

The following example drops a user called paulo:

```
drop user paulo;
```

The following example drops two users, paulo and martha:

```
drop user paulo, martha;
```

The following example drops the user paulo if it exists, or does nothing and returns a message if it doesn't:

```
drop user if exists paulo;
```

# DROP VIEW
<a name="r_DROP_VIEW"></a>

Removes a view from the database. Multiple views can be dropped with a single DROP VIEW command. This command isn't reversible.

## Required privileges
<a name="r_DROP_VIEW-privileges"></a>

Following are required privileges for DROP VIEW:
+ Superuser
+ Users with the DROP VIEW privilege
+ View owner

## Syntax
<a name="r_DROP_VIEW-synopsis"></a>

```
DROP VIEW [ IF EXISTS ] name [, ... ] [ CASCADE | RESTRICT ] 
```

## Parameters
<a name="r_DROP_VIEW-parameters"></a>

IF EXISTS  
Clause that indicates that if the specified view doesn’t exist, the command should make no changes and return a message that the view doesn't exist, rather than terminating with an error.  
This clause is useful when scripting, so the script doesn’t fail if DROP VIEW runs against a nonexistent view.

 *name*   
Name of the view to be removed.

CASCADE  
Clause that indicates to automatically drop objects that depend on the view, such as other views.  
To create a view that isn't dependent on other database objects, such as views and tables, include the WITH NO SCHEMA BINDING clause in the view definition. For more information, see [CREATE VIEW](r_CREATE_VIEW.md).  
Note that if you include CASCADE and the count of database objects dropped runs to ten or more, it's possible that your database client won't list all of the dropped objects in the summary results. This is typically because SQL client tools have default limitations on the results returned.

RESTRICT  
Clause that indicates not to drop the view if any objects depend on it. This action is the default.

## Examples
<a name="r_DROP_VIEW-examples"></a>

The following example drops the view called *event*:

```
drop view event;
```

To remove a view that has dependencies, use the CASCADE option. For example, say we start with a table called EVENT. We then create the eventview view of the EVENT table, using the CREATE VIEW command, as shown in the following example: 

```
create view eventview as
select dateid, eventname, catid
from event where catid = 1;
```

Now, we create a second view called *myeventview*, that is based on the first view *eventview*:

```
create view myeventview as
select eventname, catid
from eventview where eventname <> ' ';
```

At this point, two views have been created: *eventview* and *myeventview*.

The *myeventview* view is a child view with *eventview* as its parent.

To delete the *eventview* view, the obvious command to use is the following: 

```
drop view eventview;
```

Notice that if you run this command in this case, you get the following error:

```
drop view eventview;
ERROR: cannot drop view eventview because other objects depend on it
HINT: Use DROP ... CASCADE to drop the dependent objects too.
```

To remedy this, run the following command (as suggested in the error message): 

```
drop view eventview cascade;
```

Both *eventview* and *myeventview* have now been dropped successfully.

The following example either drops the *eventview* view if it exists, or does nothing and returns a message if it doesn't:

```
drop view if exists eventview;
```

# END
<a name="r_END"></a>

Commits the current transaction. Performs exactly the same function as the COMMIT command.

See [COMMIT](r_COMMIT.md) for more detailed documentation.

## Syntax
<a name="r_END-synopsis"></a>

```
END [ WORK | TRANSACTION ]
```

## Parameters
<a name="r_END-parameters"></a>

WORK  
Optional keyword.

TRANSACTION  
Optional keyword; WORK and TRANSACTION are synonyms.

## Examples
<a name="r_END-examples"></a>

The following examples all end the transaction block and commit the transaction:

```
end;
```

```
end work;
```

```
end transaction;
```

After any of these commands, Amazon Redshift ends the transaction block and commits the changes.
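The effect of END (or COMMIT) is that changes made inside the transaction become visible to other sessions only once the transaction ends. As a rough sketch of that semantics (SQLite through Python's `sqlite3` module stands in for two Redshift sessions; the file name is arbitrary), a second connection can't see an uncommitted insert:

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file act as two sessions.
path = os.path.join(tempfile.mkdtemp(), "end_demo.db")
writer = sqlite3.connect(path, isolation_level=None)  # manual transactions
reader = sqlite3.connect(path)

writer.execute("create table t(a int)")
writer.execute("begin")
writer.execute("insert into t values (1)")

# The uncommitted row is invisible to the other session...
before = reader.execute("select count(*) from t").fetchone()[0]

# ...until the transaction is committed (END, END WORK, END TRANSACTION,
# and COMMIT are all equivalent in Redshift).
writer.execute("commit")
after = reader.execute("select count(*) from t").fetchone()[0]

print(before, after)  # 0 1
```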

# EXECUTE
<a name="r_EXECUTE"></a>

Runs a previously prepared statement. 

## Syntax
<a name="r_EXECUTE-synopsis"></a>

```
EXECUTE plan_name [ (parameter [, ...]) ]
```

## Parameters
<a name="r_EXECUTE-parameters"></a>

*plan\_name*  
Name of the prepared statement to be run. 

 *parameter*   
The actual value of a parameter to the prepared statement. This must be an expression yielding a value of a type compatible with the data type specified for this parameter position in the PREPARE command that created the prepared statement. 

## Usage notes
<a name="r_EXECUTE_usage_notes"></a>

EXECUTE is used to run a previously prepared statement. Because prepared statements only exist for the duration of a session, the prepared statement must have been created by a PREPARE statement run earlier in the current session. 

If the previous PREPARE statement specified some parameters, a compatible set of parameters must be passed to the EXECUTE statement, or else Amazon Redshift returns an error. Unlike functions, prepared statements aren't overloaded based on the type or number of specified parameters; the name of a prepared statement must be unique within a database session. 

When an EXECUTE command is issued for the prepared statement, Amazon Redshift may optionally revise the query execution plan (to improve performance based on the specified parameter values) before running the prepared statement. Also, for each new execution of a prepared statement, Amazon Redshift may revise the query execution plan again based on the different parameter values specified with the EXECUTE statement. To examine the query execution plan that Amazon Redshift has chosen for any given EXECUTE statements, use the [EXPLAIN](r_EXPLAIN.md) command. 

For examples and more information on the creation and usage of prepared statements, see [PREPARE](r_PREPARE.md). 
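The PREPARE/EXECUTE split, statement text fixed once and parameter values supplied at each execution, maps directly onto parameterized queries in database client APIs. As an illustration only (SQLite through Python's `sqlite3` module; the table and values are made up for the example), the same statement runs repeatedly with different parameters:

```python
import sqlite3

# A parameterized query is the DB-API analogue of PREPARE: the statement
# text is fixed, and each execution supplies parameter values, as with
# EXECUTE plan_name (parameter).
con = sqlite3.connect(":memory:")
con.execute("create table prices(item text, amount int)")
con.executemany("insert into prices values (?, ?)",
                [("a", 10), ("b", 20), ("c", 30)])

query = "select item from prices where amount > ? order by amount"
print(con.execute(query, (15,)).fetchall())  # [('b',), ('c',)]
print(con.execute(query, (25,)).fetchall())  # [('c',)]
```

As in Redshift, the parameter value must be compatible with the type expected at that position; the statement itself isn't re-parsed for each set of values.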

## See also
<a name="r_EXECUTE-see-also"></a>

 [DEALLOCATE](r_DEALLOCATE.md), [PREPARE](r_PREPARE.md) 

# EXPLAIN
<a name="r_EXPLAIN"></a>

Displays the execution plan for a query statement without running the query. For information about the query analysis workflow, see [Query analysis workflow](c-query-analysis-process.md).

## Syntax
<a name="r_EXPLAIN-synopsis"></a>

```
EXPLAIN [ VERBOSE ] query
```

## Parameters
<a name="r_EXPLAIN-parameters"></a>

VERBOSE   
Displays the full query plan instead of just a summary.

 *query*   
Query statement to explain. The query can be a SELECT, INSERT, CREATE TABLE AS, UPDATE, or DELETE statement.

## Usage notes
<a name="r_EXPLAIN-usage-notes"></a>

EXPLAIN performance is sometimes influenced by the time it takes to create temporary tables. For example, a query that uses the common subexpression optimization requires temporary tables to be created and analyzed in order to return the EXPLAIN output. The query plan depends on the schema and statistics of the temporary tables. Therefore, the EXPLAIN command for this type of query might take longer to run than expected.

You can use EXPLAIN only for the following commands:
+ SELECT
+ SELECT INTO
+ CREATE TABLE AS
+ INSERT
+ UPDATE
+ DELETE

The EXPLAIN command will fail if you use it for other SQL commands, such as data definition language (DDL) or database operations.

Amazon Redshift uses the relative unit costs in the EXPLAIN output to choose a query plan. It compares the sizes of various resource estimates to determine the plan.
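Other engines expose the same plan-without-execution workflow, which can be handy for experimenting with the concept outside a cluster. This sketch uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module; the plan text is engine-specific and looks nothing like Redshift's XN nodes, so treat it only as an analogue:

```python
import sqlite3

# EXPLAIN QUERY PLAN describes how SQLite would run the query without
# executing it, much as Redshift's EXPLAIN does for its own planner.
con = sqlite3.connect(":memory:")
con.execute("create table event(eventid int primary key, venueid int)")
con.execute("create table venue(venueid int primary key, venuename text)")

plan = con.execute("""
    explain query plan
    select e.eventid, v.venuename
    from event e join venue v on e.venueid = v.venueid
""").fetchall()

for row in plan:
    print(row[-1])  # one human-readable line per plan step
```

Because the query is planned but never run, no table data is touched, which is the same property that makes EXPLAIN safe to use on expensive Redshift queries.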

## Query planning and execution steps
<a name="r_EXPLAIN-query-planning-and-execution-steps"></a>

The execution plan for a specific Amazon Redshift query statement breaks down execution and calculation of a query into a discrete sequence of steps and table operations that eventually produce a final result set for the query. For information about query planning, see [Query processing](c-query-processing.md).

The following table provides a summary of steps that Amazon Redshift can use in developing an execution plan for any query a user submits for execution.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_EXPLAIN.html)

## Using EXPLAIN for RLS
<a name="r_EXPLAIN-RLS"></a>

If a query contains a table that is subject to row-level security (RLS) policies, EXPLAIN displays a special RLS SecureScan node. Amazon Redshift also logs the same node type to the STL\_EXPLAIN system table. EXPLAIN doesn't reveal the RLS predicate that applies to the protected table. The RLS SecureScan node type serves as an indicator that the execution plan contains additional operations that are invisible to the current user.

The following example illustrates an RLS SecureScan node.

```
EXPLAIN
SELECT D.cint
FROM fact_tbl F INNER JOIN dim_tbl D ON F.k_dim = D.k
WHERE F.k_dim / 10 > 0;
                               QUERY PLAN
------------------------------------------------------------------------
 XN Hash Join DS_DIST_ALL_NONE  (cost=0.08..0.25 rows=1 width=4)
   Hash Cond: ("outer".k_dim = "inner"."k")
   ->  XN RLS SecureScan f  (cost=0.00..0.14 rows=2 width=4)
         Filter: ((k_dim / 10) > 0)
   ->  XN Hash  (cost=0.07..0.07 rows=2 width=8)
         ->  XN Seq Scan on dim_tbl d  (cost=0.00..0.07 rows=2 width=8)
               Filter: (("k" / 10) > 0)
```

To enable full investigation of query plans that are subject to RLS, Amazon Redshift offers the EXPLAIN RLS system permissions. Users that have been granted this permission can inspect complete query plans that also include RLS predicates. 

The following example illustrates that an additional Seq Scan below the RLS SecureScan node also includes the RLS policy predicate (k\_dim > 1).

```
EXPLAIN SELECT D.cint
FROM fact_tbl F INNER JOIN dim_tbl D ON F.k_dim = D.k
WHERE F.k_dim / 10 > 0;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 XN Hash Join DS_DIST_ALL_NONE  (cost=0.08..0.25 rows=1 width=4)
   Hash Cond: ("outer".k_dim = "inner"."k")
   ->  XN RLS SecureScan f  (cost=0.00..0.14 rows=2 width=4)
         Filter: ((k_dim / 10) > 0)
         ->  XN Seq Scan on fact_tbl rls_table  (cost=0.00..0.06 rows=5 width=8)
               Filter: (k_dim > 1)
   ->  XN Hash  (cost=0.07..0.07 rows=2 width=8)
         ->  XN Seq Scan on dim_tbl d  (cost=0.00..0.07 rows=2 width=8)
               Filter: (("k" / 10) > 0)
```

While the EXPLAIN RLS permission is granted to a user, Amazon Redshift logs the full query plan, including RLS predicates, in the STL\_EXPLAIN system table. Queries that are run while this permission isn't granted are logged without RLS internals. Granting or removing the EXPLAIN RLS permission doesn't change what Amazon Redshift has logged to STL\_EXPLAIN for previous queries.

### AWS Lake Formation-RLS protected Redshift relations
<a name="r_EXPLAIN_RLS-LF"></a>

The following example illustrates an LF SecureScan node, which you can use to view Lake Formation-RLS relations.

```
EXPLAIN
SELECT *
FROM lf_db.public.t_share
WHERE a > 1;
QUERY PLAN
---------------------------------------------------------------
XN LF SecureScan t_share  (cost=0.00..0.02 rows=2 width=11)
(2 rows)
```

## Examples
<a name="r_EXPLAIN-examples"></a>

**Note**  
For these examples, the sample output might vary depending on Amazon Redshift configuration.

The following example returns the query plan for a query that selects the EVENTID, EVENTNAME, VENUEID, and VENUENAME from the EVENT and VENUE tables:

```
explain
select eventid, eventname, event.venueid, venuename
from event, venue
where event.venueid = venue.venueid;
```

```
                                QUERY PLAN
--------------------------------------------------------------------------
XN Hash Join DS_DIST_OUTER  (cost=2.52..58653620.93 rows=8712 width=43)
Hash Cond: ("outer".venueid = "inner".venueid)
->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=23)
->  XN Hash  (cost=2.02..2.02 rows=202 width=22)
->  XN Seq Scan on venue  (cost=0.00..2.02 rows=202 width=22)
(5 rows)
```

The following example returns the query plan for the same query with verbose output:

```
explain verbose
select eventid, eventname, event.venueid, venuename
from event, venue
where event.venueid = venue.venueid;
```

```
                                QUERY PLAN
--------------------------------------------------------------------------
{HASHJOIN
:startup_cost 2.52
:total_cost 58653620.93
:plan_rows 8712
:plan_width 43
:best_pathkeys <>
:dist_info DS_DIST_OUTER
:dist_info.dist_keys (
TARGETENTRY
{
VAR
:varno 2
:varattno 1
...

XN Hash Join DS_DIST_OUTER  (cost=2.52..58653620.93 rows=8712 width=43)
Hash Cond: ("outer".venueid = "inner".venueid)
->  XN Seq Scan on event  (cost=0.00..87.98 rows=8798 width=23)
->  XN Hash  (cost=2.02..2.02 rows=202 width=22)
->  XN Seq Scan on venue  (cost=0.00..2.02 rows=202 width=22)
(519 rows)
```

The following example returns the query plan for a CREATE TABLE AS (CTAS) statement: 

```
explain create table venue_nonulls as
select * from venue
where venueseats is not null;

QUERY PLAN
-----------------------------------------------------------
XN Seq Scan on venue  (cost=0.00..2.02 rows=187 width=45)
Filter: (venueseats IS NOT NULL)
(2 rows)
```

# FETCH
<a name="fetch"></a>

Retrieves rows using a cursor. For information about declaring a cursor, see [DECLARE](declare.md).

FETCH retrieves rows based on the current position within the cursor. When a cursor is created, it is positioned before the first row. After a FETCH, the cursor is positioned on the last row retrieved. If FETCH runs off the end of the available rows, such as following a FETCH ALL, the cursor is left positioned after the last row. 

FORWARD 0 fetches the current row without moving the cursor; that is, it fetches the most recently fetched row. If the cursor is positioned before the first row or after the last row, no row is returned. 

When the first row of a cursor is fetched, the entire result set is materialized on the leader node, in memory or on disk, if needed. Because of the potential negative performance impact of using cursors with large result sets, we recommend using alternative approaches whenever possible. For more information, see [Performance considerations when using cursors](declare.md#declare-performance).

For more information, see [DECLARE](declare.md), [CLOSE](close.md). 

## Syntax
<a name="fetch-synopsis"></a>

```
FETCH [ NEXT | ALL | {FORWARD [ count | ALL ] } ] FROM cursor
```

## Parameters
<a name="fetch-parameters"></a>

NEXT  
Fetches the next row. This is the default.

ALL  
Fetches all remaining rows. (Same as FORWARD ALL.) ALL isn't supported for single-node clusters.

FORWARD [ *count* \| ALL ]   
Fetches the next *count* rows, or all remaining rows. `FORWARD 0` fetches the current row. For single-node clusters, the maximum value for count is `1000`. FORWARD ALL isn't supported for single-node clusters. 

*cursor*   
Name of the cursor to fetch rows from. 

## FETCH example
<a name="fetch-example"></a>

The following example declares a cursor named LOLLAPALOOZA to select sales information for the Lollapalooza event, and then fetches rows from the result set using the cursor:

```
-- Begin a transaction

begin;

-- Declare a cursor

declare lollapalooza cursor for
select eventname, starttime, pricepaid/qtysold as costperticket, qtysold
from sales, event
where sales.eventid = event.eventid
and eventname='Lollapalooza';

-- Fetch the first 5 rows in the cursor lollapalooza:

fetch forward 5 from lollapalooza;

  eventname   |      starttime      | costperticket | qtysold
--------------+---------------------+---------------+---------
 Lollapalooza | 2008-05-01 19:00:00 |   92.00000000 |       3
 Lollapalooza | 2008-11-15 15:00:00 |  222.00000000 |       2
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       3
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       4
 Lollapalooza | 2008-04-17 15:00:00 |  239.00000000 |       1
(5 rows)

-- Fetch the next row:

fetch next from lollapalooza;

  eventname   |      starttime      | costperticket | qtysold
--------------+---------------------+---------------+---------
 Lollapalooza | 2008-10-06 14:00:00 |  114.00000000 |       2

-- Close the cursor and end the transaction:

close lollapalooza;
commit;
```
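The FETCH variants correspond closely to cursor methods in database client APIs: FETCH FORWARD *count* advances the cursor *count* rows, FETCH NEXT advances it one row, and after fetching everything the cursor sits past the last row. A sketch of those semantics (SQLite through Python's `sqlite3` module; the table and values are invented for the example):

```python
import sqlite3

# DB-API cursor calls mirror FETCH: fetchmany(n) ~ FETCH FORWARD n,
# fetchone() ~ FETCH NEXT, fetchall() ~ FETCH ALL.
con = sqlite3.connect(":memory:")
con.execute("create table sales(qtysold int)")
con.executemany("insert into sales values (?)", [(i,) for i in range(10)])

cur = con.execute("select qtysold from sales order by qtysold")
first_five = cur.fetchmany(5)   # like FETCH FORWARD 5 FROM cursor
next_row = cur.fetchone()       # like FETCH NEXT FROM cursor
rest = cur.fetchall()           # like FETCH ALL; cursor ends past last row

print(len(first_five), next_row, len(rest))  # 5 (5,) 4
```

Once the cursor is positioned after the last row, further fetches return nothing, matching the FETCH behavior described above.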

# GRANT
<a name="r_GRANT"></a>

Defines access permissions for a user or role.

Permissions include access options such as being able to read data in tables and views, write data, create tables, and drop tables. Use this command to give specific permissions for a table, database, schema, function, procedure, language, or column. To revoke permissions from a database object, use the [REVOKE](r_REVOKE.md) command. 

Permissions also include the following datashare producer access options:
+  Granting datashare access to consumer namespaces and accounts. 
+  Granting permission to alter a datashare by adding or removing objects from the datashare. 
+  Granting permission to share a datashare by adding or removing consumer namespaces from the datashare. 

Datashare consumer access options are as follows:
+ Granting users full access to databases created from a datashare or to external schemas that point to such databases.
+ Granting users object-level permissions on databases created from a datashare like you can for local database objects. To grant this level of permission, you must use the WITH PERMISSIONS clause when creating a database from the datashare. For more information, see [CREATE DATABASE](r_CREATE_DATABASE.md).

For more information about datashare permissions, see [Permissions you can grant to datashares](permissions-datashares.md).

Permissions also include the following Amazon Redshift Federated Permissions Catalog:
+ Granting table-level permissions to users and roles.
+ Granting fine-grained column-level permissions on tables, views and materialized views.
+ Granting scoped permissions to users and roles.
+ Granting database-level permissions on Amazon Redshift Federated Permissions Catalog.

For more information about managing permissions on Amazon Redshift Federated Permissions Catalog, see [Managing access control on Amazon Redshift federated permissions catalog](federated-permissions-managing-access.md). For more information about the grant and revoke syntax that Amazon Redshift Federated Permissions Catalog supports, see [Grant/Revoke](https://docs.aws.amazon.com/redshift/latest/dg/federated-permissions-managing-access.html#federated-permissions-managing-access-grant-revoke).

Permissions also include the CONNECT privilege for AWS IAM Identity Center federated users. This privilege enables administrators to control user access through granular permissions at each Amazon Redshift workgroup or cluster where Amazon Redshift Federated Permissions are enabled. An Amazon Redshift administrator can specify which AWS IAM Identity Center federated users or groups can connect directly to an Amazon Redshift workgroup, providing fine-grained control over AWS IAM Identity Center user access at each workgroup or cluster.

You can also grant roles to manage database permissions and control what users can do relative to your data. By defining roles and assigning roles to users, you can limit the actions those users can take, such as limiting users to only the CREATE TABLE and INSERT commands. For more information about the CREATE ROLE command, see [CREATE ROLE](r_CREATE_ROLE.md). Amazon Redshift has some system-defined roles that you can also use to grant specific permissions to your users. For more information, see [Amazon Redshift system-defined roles](r_roles-default.md).

You can only GRANT or REVOKE USAGE permissions on an external schema to database users and user groups that use the ON SCHEMA syntax. When using ON EXTERNAL SCHEMA with AWS Lake Formation, you can only GRANT and REVOKE permissions to an AWS Identity and Access Management (IAM) role. For the list of permissions, see the syntax.

For stored procedures, the only permission that you can grant is EXECUTE.

You can't run GRANT (on an external resource) within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

To see which permissions users have been granted for a database, use [HAS\_DATABASE\_PRIVILEGE](r_HAS_DATABASE_PRIVILEGE.md). To see which permissions users have been granted for a schema, use [HAS\_SCHEMA\_PRIVILEGE](r_HAS_SCHEMA_PRIVILEGE.md). To see which permissions users have been granted for a table, use [HAS\_TABLE\_PRIVILEGE](r_HAS_TABLE_PRIVILEGE.md). 

## Syntax
<a name="r_GRANT-synopsis"></a>



```
GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES | ALTER | TRUNCATE } [,...] | ALL [ PRIVILEGES ] }
    ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { { CREATE | USAGE | TEMPORARY | TEMP | ALTER } [,...] | ALL [ PRIVILEGES ] }
    ON DATABASE db_name [, ...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { { CREATE | USAGE | ALTER | DROP } [,...] | ALL [ PRIVILEGES ] }
    ON SCHEMA schema_name [, ...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON { FUNCTION function_name ( [ [ argname ] argtype [, ...] ] ) [, ...] | ALL FUNCTIONS IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON { PROCEDURE procedure_name ( [ [ argname ] argtype [, ...] ] ) [, ...] | ALL PROCEDURES IN SCHEMA schema_name [, ...] }
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT USAGE
    ON LANGUAGE language_name [, ...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]             

GRANT { { ALTER | DROP} [,...] | ALL [ PRIVILEGES ] }
    ON COPY JOB job_name [,...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { { ALTER | DROP | USAGE } [,...] | ALL [ PRIVILEGES ] }
    ON TEMPLATE [database_name.][schema_name.]template_name [,...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```

### Granting column-level permissions for tables
<a name="grant-column-level"></a>

The following is the syntax for column-level permissions on Amazon Redshift tables and views.

```
GRANT { { SELECT | UPDATE } ( column_name [, ...] ) [, ...] | ALL [ PRIVILEGES ] ( column_name [,...] ) }
     ON { [ TABLE ] table_name [, ...] }

     TO { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```

### Granting ASSUMEROLE permissions
<a name="grant-assumerole-permissions"></a>

The following is the syntax for the ASSUMEROLE permissions granted to users and groups with a specified role. To begin using the ASSUMEROLE privilege, see [Usage notes for granting the ASSUMEROLE permission](r_GRANT-usage-notes.md#r_GRANT-usage-notes-assumerole).

```
GRANT ASSUMEROLE
       ON { 'iam_role' [, ...] | default | ALL }
       TO { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
       FOR { ALL | COPY | UNLOAD | EXTERNAL FUNCTION | CREATE MODEL } [, ...]
```

### Granting permissions for Redshift Spectrum integration with Lake Formation
<a name="grant-spectrum-integration-with-lf-syntax"></a>

The following is the syntax for Redshift Spectrum integration with Lake Formation. 

```
GRANT { SELECT | ALL [ PRIVILEGES ] } ( column_list )
    ON EXTERNAL TABLE schema_name.table_name
    TO { IAM_ROLE iam_role } [, ...] [ WITH GRANT OPTION ]

GRANT { { SELECT | ALTER | DROP | DELETE | INSERT }  [, ...] | ALL [ PRIVILEGES ] }
    ON EXTERNAL TABLE schema_name.table_name [, ...]
    TO { { IAM_ROLE iam_role } [, ...] | PUBLIC } [ WITH GRANT OPTION ]

GRANT { { CREATE | ALTER | DROP }  [, ...] | ALL [ PRIVILEGES ] }
    ON EXTERNAL SCHEMA schema_name [, ...]
    TO { IAM_ROLE iam_role } [, ...] [ WITH GRANT OPTION ]
```

### Granting datashare permissions
<a name="grant-datashare-syntax"></a>

**Producer-side datashare permissions**  
The following is the syntax for using GRANT to grant ALTER or SHARE permissions to a user or role. The user can alter the datashare with the ALTER permission, or grant usage to a consumer with the SHARE permission. ALTER and SHARE are the only permissions that you can grant on a datashare to users and roles.

```
GRANT { ALTER | SHARE } ON DATASHARE datashare_name
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```

The following is the syntax for using GRANT for datashare usage permissions on Amazon Redshift. You grant access to a datashare to a consumer using the USAGE permission. You can't grant this permission to users or user groups. This permission also doesn't support the WITH GRANT OPTION for the GRANT statement. Only users or user groups with the SHARE permission previously granted to them FOR the datashare can run this type of GRANT statement.

```
GRANT USAGE
    ON DATASHARE datashare_name
    TO NAMESPACE 'namespaceGUID' | ACCOUNT 'accountnumber' [ VIA DATA CATALOG ]
```

The following is an example of how to grant usage of a datashare to a Lake Formation account.

```
GRANT USAGE ON DATASHARE salesshare TO ACCOUNT '123456789012' VIA DATA CATALOG;
```

**Consumer-side datashare permissions**  
The following is the syntax for GRANT data-sharing usage permissions on a specific database or schema created from a datashare. 

Further permissions required for consumers to access a database created from a datashare vary depending on whether or not the CREATE DATABASE command used to create the database from the datashare used the WITH PERMISSIONS clause. For more information about the CREATE DATABASE command and WITH PERMISSIONS clause, see [CREATE DATABASE](r_CREATE_DATABASE.md).

**Databases created without using the WITH PERMISSIONS clause**  
When you grant USAGE on a database created from a datashare without the WITH PERMISSIONS clause, you don't need to grant permissions separately on the objects in the shared database. Entities granted usage on databases created from datashares without the WITH PERMISSIONS clause automatically have access to all objects in the database.

**Databases created using the WITH PERMISSIONS clause**  
When you grant USAGE on a database where the shared database was created from a datashare with the WITH PERMISSIONS clause, consumer-side identities must still be granted the relevant permissions for database objects in the shared database in order to access them, just as you would grant permissions for local database objects. To grant permissions to objects in a database created from a datashare, use the three-part syntax `database_name.schema_name.object_name`. To grant permissions to objects in an external schema pointing to a shared schema within the shared database, use the two-part syntax `schema_name.object_name`.

```
GRANT USAGE ON { DATABASE shared_database_name [, ...] | SCHEMA shared_schema}
    TO { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```
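
For example, the following statements sketch a typical consumer-side setup. The names `sales_db`, `public`, `orders`, and `alice` are hypothetical. The second statement is needed only when the shared database was created with the WITH PERMISSIONS clause; note the three-part object name.

```
GRANT USAGE ON DATABASE sales_db TO alice;

-- Only required for databases created WITH PERMISSIONS:
GRANT SELECT ON sales_db.public.orders TO alice;
```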

### Granting scoped permissions
<a name="grant-scoped-syntax"></a>

Scoped permissions let you grant permissions to a user or role on all objects of a type within a database or schema. Users and roles with scoped permissions have the specified permissions on all current and future objects within the database or schema.

You can view the scope of database-level scoped permissions in [SVV\_DATABASE\_PRIVILEGES](r_SVV_DATABASE_PRIVILEGES.md). You can view the scope of schema-level scoped permissions in [SVV\_SCHEMA\_PRIVILEGES](r_SVV_SCHEMA_PRIVILEGES.md).

For more information about scoped permissions, see [Scoped permissions](t_scoped-permissions.md).

The following is the syntax for granting scoped permissions to users and roles.

```
GRANT { { CREATE | USAGE | ALTER | DROP } [,...] | ALL [ PRIVILEGES ] }
FOR SCHEMAS IN
DATABASE db_name 
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]

GRANT 
{ { SELECT | INSERT | UPDATE | DELETE | DROP | ALTER | TRUNCATE | REFERENCES } [, ...] | ALL [ PRIVILEGES ] }
FOR TABLES IN
{SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
TO { username [ WITH GRANT OPTION ] | ROLE role_name} [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
FOR FUNCTIONS IN 
{SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
FOR PROCEDURES IN
{SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]

GRANT USAGE
FOR LANGUAGES IN
{DATABASE db_name}
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]  

GRANT { { CREATE | ALTER | DROP} [,...] | ALL [ PRIVILEGES ] }
FOR COPY JOBS 
IN DATABASE db_name
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]

GRANT { { ALTER | DROP | USAGE } [,...] | ALL [ PRIVILEGES ] }
FOR TEMPLATES IN
{SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
TO { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]
```

Note that scoped permissions don’t distinguish between permissions for functions and for procedures. For example, the following statement grants `bob` the `EXECUTE` permission for both functions and procedures in the schema `Sales_schema`.

```
GRANT EXECUTE FOR FUNCTIONS IN SCHEMA Sales_schema TO bob;
```
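
As another illustrative sketch (the database and role names here are hypothetical), the following statement grants SELECT on all current and future tables in the database `dev` to the role `analyst_role`.

```
GRANT SELECT FOR TABLES IN DATABASE dev TO ROLE analyst_role;
```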

### Granting machine learning permissions
<a name="grant-model-syntax"></a>

The following is the syntax for machine learning model permissions on Amazon Redshift.

```
GRANT CREATE MODEL
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON MODEL model_name [, ...]
    TO { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```
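
For example, the following hypothetical statements grant the user `ml_developer` permission to create models, and grant the role `analyst_role` permission to run the prediction function of the model `demand_forecast`. The names are illustrative.

```
GRANT CREATE MODEL TO ml_developer;
GRANT EXECUTE ON MODEL demand_forecast TO ROLE analyst_role;
```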

### Granting role permissions
<a name="grant-roles"></a>

The following is the syntax for granting roles on Amazon Redshift.

```
GRANT { ROLE role_name } [, ...] TO { { user_name [ WITH ADMIN OPTION ] } | ROLE role_name }[, ...]
```
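
For example, the following hypothetical statements grant `sample_role1` to the user `user1` with permission to administer the role, and grant `sample_role1` to `sample_role2` so that `sample_role2` inherits its permissions. The role and user names are illustrative.

```
GRANT ROLE sample_role1 TO user1 WITH ADMIN OPTION;
GRANT ROLE sample_role1 TO ROLE sample_role2;
```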

The following is the syntax for granting system permissions to roles on Amazon Redshift. Note that you can only grant permissions to roles, not users.

```
GRANT
  {
    { CREATE USER | DROP USER | ALTER USER |
    CREATE SCHEMA | DROP SCHEMA |
    ALTER DEFAULT PRIVILEGES |
    ACCESS CATALOG | ACCESS SYSTEM TABLE |
    CREATE TABLE | DROP TABLE | ALTER TABLE |
    CREATE OR REPLACE FUNCTION | CREATE OR REPLACE EXTERNAL FUNCTION |
    DROP FUNCTION |
    CREATE OR REPLACE PROCEDURE | DROP PROCEDURE |
    CREATE OR REPLACE VIEW | DROP VIEW |
    CREATE MODEL | DROP MODEL |
    CREATE DATASHARE | ALTER DATASHARE | DROP DATASHARE |
    CREATE LIBRARY | DROP LIBRARY |
    CREATE ROLE | DROP ROLE |
    TRUNCATE TABLE |
    VACUUM | ANALYZE | CANCEL |
    IGNORE RLS | EXPLAIN RLS | 
    EXPLAIN MASKING }[, ...]
  }
  | { ALL [ PRIVILEGES ] }
TO ROLE role_name [, ...]
```
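
For example, the following hypothetical statement grants user-management system permissions to the role `user_admin_role` (an illustrative name).

```
GRANT CREATE USER, DROP USER, ALTER USER TO ROLE user_admin_role;
```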

### Granting explain permissions for security policies
<a name="grant-row-level-security"></a>

The following is the syntax for granting permissions to explain the security policy filters of a query in the EXPLAIN plan. Possible security policies include row-level security policies and dynamic data masking policies.

```
GRANT EXPLAIN { RLS | MASKING } TO ROLE rolename 
```
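
For example, the following hypothetical statements let members of the role `security_auditor` (an illustrative name) see both row-level security and dynamic data masking policy filters in EXPLAIN output.

```
GRANT EXPLAIN RLS TO ROLE security_auditor;
GRANT EXPLAIN MASKING TO ROLE security_auditor;
```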

The following is the syntax for granting permissions to bypass row-level security policies for a query. This syntax doesn't apply to dynamic data masking policies.

```
GRANT IGNORE RLS TO ROLE rolename 
```

The following is the syntax for granting lookup table permissions to the specified security policy. Possible security policies include row-level security policies and dynamic data masking policies.

```
GRANT SELECT ON [ TABLE ] table_name [, ...]
TO { RLS | MASKING } POLICY policy_name [, ...]
```
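
For example, the following hypothetical statement lets the row-level security policy `sales_region_policy` read the lookup table `region_lookup` when evaluating its predicate. Both names are illustrative.

```
GRANT SELECT ON TABLE region_lookup TO RLS POLICY sales_region_policy;
```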

### Granting connection permissions
<a name="grant-connection-permissions"></a>

The following is the syntax for granting AWS IAM Identity Center federated users or groups permission to connect to a workgroup or cluster:

```
GRANT CONNECT [ON WORKGROUP]
TO { [USER] <prefix>:<username> | ROLE <prefix>:<rolename> | PUBLIC };
```
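
As an illustrative sketch, the following statements grant the CONNECT permission to a single federated user and to all federated users. The prefix (here `AWSIDC`) and the user name depend on your identity provider configuration and are hypothetical; the identifier is quoted because it contains a colon.

```
GRANT CONNECT ON WORKGROUP TO "AWSIDC:alice";
GRANT CONNECT ON WORKGROUP TO PUBLIC;
```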

## Parameters
<a name="r_GRANT-parameters"></a>

SELECT   <a name="grant-select"></a>
Grants permission to select data from a table or view using a SELECT statement. The SELECT permission is also required to reference existing column values for UPDATE or DELETE operations.

INSERT   <a name="grant-insert"></a>
Grants permission to load data into a table using an INSERT statement or a COPY statement. 

UPDATE   <a name="grant-update"></a>
Grants permission to update a table column using an UPDATE statement. UPDATE operations also require the SELECT permission, because they must reference table columns to determine which rows to update, or to compute new values for columns.

DELETE  <a name="grant-delete"></a>
Grants permission to delete a data row from a table. DELETE operations also require the SELECT permission, because they must reference table columns to determine which rows to delete.

DROP  <a name="grant-drop"></a>
Depending on the database object, grants the following permissions to the user or role:   
+  For tables, DROP grants permission to drop a table or view. For more information, see [DROP TABLE](r_DROP_TABLE.md). 
+  For databases, DROP grants permission to drop a database. For more information, see [DROP DATABASE](r_DROP_DATABASE.md). 
+  For schemas, DROP grants permission to drop a schema. For more information, see [DROP SCHEMA](r_DROP_SCHEMA.md). 

REFERENCES   <a name="grant-references"></a>
Grants permission to create a foreign key constraint. You need to grant this permission on both the referenced table and the referencing table; otherwise, the user can't create the constraint. 

ALTER  <a name="grant-alter"></a>
Depending on the database object, grants the following permissions to the user or user group:   
+ For tables, ALTER grants permission to alter a table or view. For more information, see [ALTER TABLE](r_ALTER_TABLE.md).
+ For databases, ALTER grants permission to alter a database. For more information, see [ALTER DATABASE](r_ALTER_DATABASE.md).
+ For schemas, ALTER grants permission to alter a schema. For more information, see [ALTER SCHEMA](r_ALTER_SCHEMA.md).
+ For external tables, ALTER grants permission to alter a table in an AWS Glue Data Catalog that is enabled for Lake Formation. This permission only applies when using Lake Formation.

TRUNCATE  <a name="grant-truncate"></a>
Grants permission to truncate a table. Without this permission, only the owner of a table or a superuser can truncate a table. For more information about the TRUNCATE command, see [TRUNCATE](r_TRUNCATE.md).

ALL [ PRIVILEGES ]   <a name="grant-all"></a>
Grants all available permissions at once to the specified user or role. The PRIVILEGES keyword is optional.  
GRANT ALL ON SCHEMA doesn't grant CREATE permissions for external schemas.  
You can grant the ALL permission to a table in an AWS Glue Data Catalog that is enabled for Lake Formation. In this case, individual permissions (such as SELECT, ALTER, and so on) are recorded in the Data Catalog.   
 Amazon Redshift doesn't support the RULE and TRIGGER permissions. For more information, go to [Unsupported PostgreSQL features](c_unsupported-postgresql-features.md). 

ASSUMEROLE  <a name="assumerole"></a>
Grants permission to run COPY, UNLOAD, EXTERNAL FUNCTION, and CREATE MODEL commands to users, roles, or groups with a specified role. The user, role, or group assumes that role when running the specified command. To begin using the ASSUMEROLE permission, see [Usage notes for granting the ASSUMEROLE permission](r_GRANT-usage-notes.md#r_GRANT-usage-notes-assumerole).

ON [ TABLE ] *table\_name*   <a name="grant-on-table"></a>
Grants the specified permissions on a table or a view. The TABLE keyword is optional. You can list multiple tables and views in one statement.

ON ALL TABLES IN SCHEMA *schema\_name*   <a name="grant-all-tables"></a>
Grants the specified permissions on all tables and views in the referenced schema.

( *column\_name* [,...] ) ON TABLE *table\_name*   <a name="grant-column-level-privileges"></a>
Grants the specified permissions to users, groups, or PUBLIC on the specified columns of the Amazon Redshift table or view.

( *column\_list* ) ON EXTERNAL TABLE *schema\_name.table\_name*   <a name="grant-external-table-column"></a>
Grants the specified permissions to an IAM role on the specified columns of the Lake Formation table in the referenced schema.

ON EXTERNAL TABLE *schema\_name.table\_name*   <a name="grant-external-table"></a>
Grants the specified permissions to an IAM role on the specified Lake Formation tables in the referenced schema.

ON EXTERNAL SCHEMA *schema\_name*   <a name="grant-external-schema"></a>
Grants the specified permissions to an IAM role on the referenced schema.

ON *iam\_role*   <a name="grant-iam_role"></a>
Grants the specified permissions to an IAM role.

TO *username*   <a name="grant-to"></a>
Indicates the user receiving the permissions.

TO IAM\_ROLE *iam\_role*   <a name="grant-to-iam-role"></a>
Indicates the IAM role receiving the permissions.

WITH GRANT OPTION   <a name="grant-with-grant"></a>
Indicates that the user receiving the permissions can in turn grant the same permissions to others. WITH GRANT OPTION can't be granted to a group or to PUBLIC.

ROLE *role\_name*   <a name="grant-role"></a>
Grants the permissions to a role.

GROUP *group\_name*   <a name="grant-group"></a>
Grants the permissions to a user group. Can be a comma-separated list to specify multiple user groups.

PUBLIC   <a name="grant-public"></a>
Grants the specified permissions to all users, including users created later. PUBLIC represents a group that always includes all users. An individual user's permissions consist of the sum of permissions granted to PUBLIC, permissions granted to any groups that the user belongs to, and any permissions granted to the user individually.  
Granting PUBLIC to a Lake Formation EXTERNAL TABLE results in granting the permission to the Lake Formation *everyone* group.

CONNECT [ON WORKGROUP] TO { [USER] <prefix>:<username> \| ROLE <prefix>:<rolename> \| PUBLIC }  
Grants the permission to connect to a workgroup or cluster to AWS IAM Identity Center federated users or groups. The prefix identifies the identity provider. When granted to PUBLIC, the permission applies to all AWS IAM Identity Center federated users, including users created later. This permission is applicable only when Amazon Redshift Federated Permissions are enabled on the workgroup or cluster.

CREATE   <a name="grant-create"></a>
Depending on the database object, grants the following permissions to the user or user group:  
+ For databases, CREATE allows users to create schemas within the database.
+ For schemas, CREATE allows users to create objects within a schema. To rename an object, the user must have the CREATE permission and own the object to be renamed.
+ CREATE ON SCHEMA isn't supported for Amazon Redshift Spectrum external schemas. To grant usage of external tables in an external schema, grant USAGE ON SCHEMA to the users that need access. Only the owner of an external schema or a superuser is permitted to create external tables in the external schema. To transfer ownership of an external schema, use [ALTER SCHEMA](r_ALTER_SCHEMA.md) to change the owner. 

TEMPORARY \| TEMP   <a name="grant-temporary"></a>
Grants the permission to create temporary tables in the specified database. To run Amazon Redshift Spectrum queries, the database user must have permission to create temporary tables in the database.   
By default, users are granted permission to create temporary tables by their automatic membership in the PUBLIC group. To remove the permission for any users to create temporary tables, revoke the TEMP permission from the PUBLIC group. Then explicitly grant the permission to create temporary tables to specific users or groups of users.
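
The steps above can be sketched as follows, assuming a database named `dev` and a group named `etl_group` (both hypothetical).

```
-- Remove the default ability of all users to create temporary tables
REVOKE TEMP ON DATABASE dev FROM PUBLIC;

-- Re-grant it only to the users who need it
GRANT TEMP ON DATABASE dev TO GROUP etl_group;
```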

ON DATABASE *db\_name*   <a name="grant-database"></a>
Grants the specified permissions on a database.

USAGE   <a name="grant-usage"></a>
Grants USAGE permission on a specific schema, which makes objects in that schema accessible to users. Specific actions on these objects must be granted separately (for example, SELECT or UPDATE permission on tables) for local Amazon Redshift schemas. By default, all users have CREATE and USAGE permission on the PUBLIC schema.   
 When you grant USAGE to external schemas using ON SCHEMA syntax, you don't need to grant actions separately on the objects in the external schema. The corresponding catalog permissions control granular permissions on the external schema objects. 

ON SCHEMA *schema\_name*   <a name="grant-schema"></a>
Grants the specified permissions on a schema.  
GRANT CREATE ON SCHEMA and the CREATE permission in GRANT ALL ON SCHEMA aren't supported for Amazon Redshift Spectrum external schemas. To grant usage of external tables in an external schema, grant USAGE ON SCHEMA to the users that need access. Only the owner of an external schema or a superuser is permitted to create external tables in the external schema. To transfer ownership of an external schema, use [ALTER SCHEMA](r_ALTER_SCHEMA.md) to change the owner. 

EXECUTE ON ALL FUNCTIONS IN SCHEMA *schema\_name*  <a name="grant-all-functions"></a>
Grants the specified permissions on all functions in the referenced schema.  
Amazon Redshift doesn't support GRANT or REVOKE statements for pg\_proc builtin entries defined in the pg\_catalog namespace. 

EXECUTE ON PROCEDURE *procedure\_name*   <a name="grant-procedure"></a>
Grants the EXECUTE permission on a specific stored procedure. Because stored procedure names can be overloaded, you must include the argument list for the procedure. For more information, see [Naming stored procedures](stored-procedure-naming.md).

EXECUTE ON ALL PROCEDURES IN SCHEMA *schema\_name*  <a name="grant-all-procedures"></a>
Grants the specified permissions on all stored procedures in the referenced schema.

USAGE ON LANGUAGE *language\_name*   
Grants the USAGE permission on a language.   
Starting November 1, 2025, Amazon Redshift will no longer support the creation of new Python UDFs. Existing Python UDFs will continue to function until June 30, 2026. Starting July 1, 2026, Amazon Redshift will no longer support Python UDFs. We recommend that you migrate your existing Python UDFs to Lambda UDFs before November 1, 2025. For information on creating and using Lambda UDFs, see [Scalar Lambda UDFs](udf-creating-a-lambda-sql-udf.md). For information on converting existing Python UDFs to Lambda UDFs, see the [ blog post ](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/).
The USAGE ON LANGUAGE permission is required to create user-defined functions (UDFs) by running the [CREATE FUNCTION](r_CREATE_FUNCTION.md) command. For more information, see [UDF security and permissions](udf-security-and-privileges.md).   
The USAGE ON LANGUAGE permission is required to create stored procedures by running the [CREATE PROCEDURE](r_CREATE_PROCEDURE.md) command. For more information, see [Security and privileges for stored procedures](stored-procedure-security-and-privileges.md).  
For Python UDFs, use `plpythonu`. For SQL UDFs, use `sql`. For stored procedures, use `plpgsql`.
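
For example, the following hypothetical statement grants the user `proc_author` (an illustrative name) permission to create stored procedures.

```
GRANT USAGE ON LANGUAGE plpgsql TO proc_author;
```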

ON COPY JOB *job\_name*  <a name="on-copy-job"></a>
Grants the specified permissions on a copy job.

FOR { ALL \| COPY \| UNLOAD \| EXTERNAL FUNCTION \| CREATE MODEL } [, ...]   <a name="grant-for"></a>
Specifies the SQL command for which the permission is granted. You can specify ALL to grant the permission on the COPY, UNLOAD, EXTERNAL FUNCTION, and CREATE MODEL statements. This clause applies only to granting the ASSUMEROLE permission.

ALTER  
Grants the ALTER permission to users to add or remove objects from a datashare, or to set the property PUBLICACCESSIBLE. For more information, see [ALTER DATASHARE](r_ALTER_DATASHARE.md).

SHARE  
Grants permissions to users and user groups to add data consumers to a datashare. This permission is required to enable the particular consumer (account or namespace) to access the datashare from their clusters. The consumer can be the same or a different AWS account, with the same or a different cluster namespace as specified by a globally unique identifier (GUID).

ON DATASHARE *datashare\_name*   <a name="grant-datashare"></a>
Grants the specified permissions on the referenced datashare. For information about consumer access control granularity, see [Data sharing at different levels in Amazon Redshift](datashare-overview.md#granularity).

USAGE  
When USAGE is granted to a consumer account or namespace within the same account, the specific consumer account or namespace within the account can access the datashare and the objects of the datashare in read-only fashion. 

TO NAMESPACE 'clusternamespace GUID'  
Indicates a namespace in the same account where consumers can receive the specified permissions to the datashare. Namespaces use a 128-bit alphanumeric GUID.

TO ACCOUNT 'accountnumber' [ VIA DATA CATALOG ]  
Indicates the account number of another AWS account whose consumers can receive the specified permissions to the datashare. Specifying 'VIA DATA CATALOG' indicates that you are granting usage of the datashare to a Lake Formation account. Omitting this parameter means you're granting usage to an account that owns the cluster.

ON DATABASE *shared\_database\_name* [, ...]   <a name="grant-datashare"></a>
Grants the specified usage permissions on the specified database that is created in the specified datashare.

ON SCHEMA *shared\_schema*   <a name="grant-datashare"></a>
Grants the specified permissions on the specified schema that is created in the specified datashare.

FOR { SCHEMAS \| TABLES \| FUNCTIONS \| PROCEDURES \| LANGUAGES \| COPY JOBS } IN   
Specifies the database objects to grant permission to. The parameters following IN define the scope of the granted permission.

CREATE MODEL  
Grants the CREATE MODEL permission to specific users or user groups.

ON MODEL *model\_name*  
Grants the EXECUTE permission on a specific model. 

ACCESS CATALOG  
Grants the permission to view relevant metadata of objects that the role has access to.

{ role } [, ...]  
The role to be granted to another role, a user, or PUBLIC.  
PUBLIC represents a group that always includes all users. An individual user's permissions consist of the sum of permissions granted to PUBLIC, permissions granted to any groups that the user belongs to, and any permissions granted to the user individually.

TO { { *user\_name* [ WITH ADMIN OPTION ] } \| role }[, ...]  
Grants the specified role to a specified user with the WITH ADMIN OPTION, another role, or PUBLIC.  
The WITH ADMIN OPTION clause provides the administration options for all the granted roles to all the grantees. 

EXPLAIN { RLS \| MASKING } TO ROLE *rolename*  
Grants the permission to explain the security policy filters of a query in the EXPLAIN plan to a role. RLS grants permission to explain row-level security policy filters. MASKING grants permission to explain dynamic data masking policy filters.

IGNORE RLS TO ROLE *rolename*   
Grants the permission to bypass row-level security policies for a query to a role.

TO { RLS \| MASKING } POLICY *policy\_name*  
Indicates the security policy receiving the permissions. TO RLS POLICY indicates a row-level security policy. TO MASKING POLICY indicates a dynamic data masking policy.

## Usage notes
<a name="r_GRANT-usage-notes-link"></a>

To learn more about the usage notes for GRANT, see [Usage notes](r_GRANT-usage-notes.md).

## Examples
<a name="r_GRANT-examples-link"></a>

For examples of how to use GRANT, see [Examples](r_GRANT-examples.md).

# Usage notes
<a name="r_GRANT-usage-notes"></a>

To grant privileges on an object, you must meet one of the following criteria:
+ Be the object owner.
+ Be a superuser.
+ Have a grant privilege for that object and privilege.

For example, the following command enables the user HR both to perform SELECT commands on the employees table and to grant and revoke the same privilege for other users.

```
grant select on table employees to HR with grant option;
```

HR can't grant privileges for any operation other than SELECT, or on any other table than employees. 

As another example, the following command enables the user HR both to perform ALTER commands on the employees table and to grant and revoke the same privilege for other users.

```
grant ALTER on table employees to HR with grant option;
```

HR can't grant privileges for any operation other than ALTER, or on any other table than employees. 

Having privileges granted on a view doesn't imply having privileges on the underlying tables. Similarly, having privileges granted on a schema doesn't imply having privileges on the tables in the schema. Instead, grant access to the underlying tables explicitly.

To grant privileges to an AWS Lake Formation table, the IAM role associated with the table's external schema must have permission to grant privileges to the external table. The following example creates an external schema with an associated IAM role `myGrantor`. The IAM role `myGrantor` has the permission to grant permissions to others. The GRANT command uses the permission of the IAM role `myGrantor` that is associated with the external schema to grant permission to the IAM role `myGrantee`.

```
create external schema mySchema
from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::123456789012:role/myGrantor'
create external database if not exists;
```

```
grant select
on external table mySchema.mytable
to iam_role 'arn:aws:iam::123456789012:role/myGrantee';
```

If you GRANT ALL privileges to an IAM role, individual privileges are granted in the related Lake Formation–enabled Data Catalog. For example, the following GRANT ALL results in the granted individual privileges (SELECT, ALTER, DROP, DELETE, and INSERT) showing in the Lake Formation console.

```
grant all
on external table mySchema.mytable
to iam_role 'arn:aws:iam::123456789012:role/myGrantee';
```

Superusers can access all objects regardless of GRANT and REVOKE commands that set object privileges.

## Usage notes for column-level access control
<a name="r_GRANT-usage-notes-clp"></a>

The following usage notes apply to column-level privileges on Amazon Redshift tables and views. These notes describe tables; the same notes apply to views unless we explicitly note an exception. 
+ For an Amazon Redshift table, you can grant only the SELECT and UPDATE privileges at the column level. For an Amazon Redshift view, you can grant only the SELECT privilege at the column level. 
+ The ALL keyword is a synonym for SELECT and UPDATE privileges combined when used in the context of a column-level GRANT on a table. 
+ If you don't have the SELECT privilege on all columns in a table, performing a SELECT \* operation returns only those columns that you have access to. When using a view, a SELECT \* operation attempts to access all columns in the view. If you don't have permission to access all columns, these queries fail with a permission denied error.
+ SELECT \* doesn't expand to only accessible columns in the following cases:
  + You can't create a regular view with only accessible columns using SELECT \*.
  + You can't create a materialized view with only accessible columns using SELECT \*.
+ If you have SELECT or UPDATE privilege on a table or view and add a column, you still have the same privileges on the table or view and thus all its columns. 
+ Only a table's owner or a superuser can grant column-level privileges. 
+ The WITH GRANT OPTION clause isn't supported for column-level privileges.
+ You can't hold the same privilege at both the table level and the column level. For example, the user `data_scientist` can't have both SELECT privilege on the table `employee` and SELECT privilege on the column `employee.department`. Consider the following results when granting the same privilege to a table and a column within the table:
  + If a user has a table-level privilege on a table, then granting the same privilege at the column level has no effect. 
  + If a user has a table-level privilege on a table, then revoking the same privilege for one or more columns of the table returns an error. Instead, revoke the privilege at the table level. 
  + If a user has a column-level privilege, then granting the same privilege at the table level returns an error. 
  + If a user has a column-level privilege, then revoking the same privilege at the table level revokes both column and table privileges for all columns on the table. 
+ You can't grant column-level privileges on late-binding views.
+ To create a materialized view, you must have table-level SELECT privilege on the base tables. Even if you have column-level privileges on specific columns, you can't create a materialized view on only those columns. However, you can grant SELECT privilege to columns of a materialized view, similar to regular views. 
+ To look up grants of column-level privileges, use the [PG\_ATTRIBUTE\_INFO](r_PG_ATTRIBUTE_INFO.md) view. 

## Usage notes for granting the ASSUMEROLE permission
<a name="r_GRANT-usage-notes-assumerole"></a>

The following usage notes apply to granting the ASSUMEROLE permission in Amazon Redshift. 

You use the ASSUMEROLE permission to control IAM role access permissions for database users, roles, or groups on commands such as COPY, UNLOAD, EXTERNAL FUNCTION, or CREATE MODEL. After you grant the ASSUMEROLE permission to a user, role, or group for an IAM role, the user, role, or group can assume that role when running the command. The ASSUMEROLE permission enables you to grant access to the appropriate commands as required.

Only a database superuser can grant or revoke the ASSUMEROLE permission for users, roles, and groups. A superuser always retains the ASSUMEROLE permission.

To enable the use of the ASSUMEROLE permission for users, roles, and groups, a superuser performs the following two actions:
+ Run the following statement once on the cluster:

  ```
  revoke assumerole on all from public for all;
  ```
+ Grant the ASSUMEROLE permission to users, roles, and groups for the appropriate commands.

You can specify role chaining in the ON clause when granting the ASSUMEROLE permission. You use commas to separate roles in a role chain, for example, `Role1,Role2,Role3`. If role chaining was specified when granting the ASSUMEROLE permission, you must specify the role chain when performing operations granted by the ASSUMEROLE permission. You can't specify individual roles within the role chain when performing operations granted by the ASSUMEROLE permission. For example, if a user, role, or group is granted the role chain `Role1,Role2,Role3`, you can't specify only `Role1` to perform operations. 
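
For example, the following hypothetical statement grants a two-role chain for COPY operations; the ARNs and user name are illustrative. The user must then specify the full chain, in the same order, when running COPY.

```
grant assumerole
on 'arn:aws:iam::123456789012:role/RoleA,arn:aws:iam::210987654321:role/RoleB'
to reg_user1
for copy;
```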

If a user attempts to perform a COPY, UNLOAD, EXTERNAL FUNCTION, or CREATE MODEL operation and hasn't been granted the ASSUMEROLE permission, a message similar to the following appears.

```
ERROR:  User awsuser does not have ASSUMEROLE permission on IAM role "arn:aws:iam::123456789012:role/RoleA" for COPY 
```

To list users that have been granted access to IAM roles and commands through the ASSUMEROLE permission, see [HAS\_ASSUMEROLE\_PRIVILEGE](r_HAS_ASSUMEROLE_PRIVILEGE.md). To list IAM roles and command permissions that have been granted to a user that you specify, see [PG\_GET\_IAM\_ROLE\_BY\_USER](PG_GET_IAM_ROLE_BY_USER.md). To list users, roles, and groups that have been granted access to an IAM role that you specify, see [PG\_GET\_GRANTEE\_BY\_IAM\_ROLE](PG_GET_GRANTEE_BY_IAMROLE.md).

## Usage notes for granting machine learning permissions
<a name="r_GRANT-usage-notes-create-model"></a>

You can't directly grant or revoke permissions related to an ML function. An ML function belongs to an ML model, and permissions are controlled through the model. Instead, you grant permissions related to the ML model. The following example demonstrates how to grant all users permission to run the ML function associated with the model `customer_churn`.

```
GRANT EXECUTE ON MODEL customer_churn TO PUBLIC;
```

You can also grant all permissions to a user for the ML model `customer_churn`.

```
GRANT ALL on MODEL customer_churn TO ml_user;
```

Granting the `EXECUTE` permission on all functions in a schema fails if the schema contains an ML function, even if that ML function already has the `EXECUTE` permission through `GRANT EXECUTE ON MODEL`. We recommend using a separate schema with the `CREATE MODEL` command to keep the ML functions in a schema by themselves. The following example demonstrates how to do so.

```
CREATE MODEL ml_schema.customer_churn
FROM customer_data
TARGET churn
FUNCTION ml_schema.customer_churn_prediction
IAM_ROLE default
SETTINGS (
 S3_BUCKET 'amzn-s3-demo-bucket'
);
```

# Examples
<a name="r_GRANT-examples"></a>

 The following example grants the SELECT privilege on the SALES table to the user `fred`. 

```
grant select on table sales to fred;
```

The following example grants the SELECT privilege on all tables in the QA\_TICKIT schema to the user `fred`. 

```
grant select on all tables in schema qa_tickit to fred;
```

The following example grants all schema privileges on the schema QA\_TICKIT to the user group QA\_USERS. Schema privileges are CREATE and USAGE. USAGE grants users access to the objects in the schema, but doesn't grant privileges such as INSERT or SELECT on those objects. Grant privileges on each object separately.

```
create group qa_users;
grant all on schema qa_tickit to group qa_users;
```

The following example grants all privileges on the SALES table in the QA\_TICKIT schema to all users in the group QA\_USERS.

```
grant all on table qa_tickit.sales to group qa_users;
```

The following example grants all privileges on the SALES table in the QA\_TICKIT schema to all users in the groups QA\_USERS and RO\_USERS.

```
grant all on table qa_tickit.sales to group qa_users, group ro_users;
```

The following example grants the DROP privilege on the SALES table in the QA\_TICKIT schema to all users in the group QA\_USERS.

```
grant drop on table qa_tickit.sales to group qa_users;
```

The following sequence of commands shows how access to a schema doesn't grant privileges on a table in the schema. 

```
create user schema_user in group qa_users password 'Abcd1234';
create schema qa_tickit;
create table qa_tickit.test (col1 int);
grant all on schema qa_tickit to schema_user;

set session authorization schema_user;
select current_user;


current_user
--------------
schema_user
(1 row)


select count(*) from qa_tickit.test;


ERROR: permission denied for relation test [SQL State=42501]


set session authorization dw_user;
grant select on table qa_tickit.test to schema_user;
set session authorization schema_user;
select count(*) from qa_tickit.test;


count
-------
0
(1 row)
```

The following sequence of commands shows how access to a view doesn't imply access to its underlying tables. The user called VIEW\_USER can't select from the DATE table, although this user has been granted all privileges on VIEW\_DATE. 

```
create user view_user password 'Abcd1234';
create view view_date as select * from date;
grant all on view_date to view_user;
set session authorization view_user;
select current_user;


current_user
--------------
view_user
(1 row)


select count(*) from view_date;


count
-------
365
(1 row)


select count(*) from date;


ERROR:  permission denied for relation date
```

The following example grants SELECT privilege on the `cust_name` and `cust_phone` columns of the `cust_profile` table to the user `user1`. 

```
grant select(cust_name, cust_phone) on cust_profile to user1;
```

The following example grants SELECT privilege on the `cust_name` and `cust_phone` columns and UPDATE privilege on the `cust_contact_preference` column of the `cust_profile` table to the `sales_group` group. 

```
grant select(cust_name, cust_phone), update(cust_contact_preference) on cust_profile to group sales_group;
```

The following example shows the usage of the ALL keyword to grant both SELECT and UPDATE privileges on three columns of the table `cust_profile` to the `sales_admin` group. 

```
grant ALL(cust_name, cust_phone,cust_contact_preference) on cust_profile to group sales_admin;
```

The following example grants the SELECT privilege on the `cust_name` column of the `cust_profile_vw` view to the `user2` user. 

```
grant select(cust_name) on cust_profile_vw to user2;
```

## Examples of granting access to datashares
<a name="r_GRANT-examples-datashare"></a>

The following examples show how to grant datasharing usage permissions on a specific database or schema created from a datashare. 

In the following example, a producer-side admin grants the USAGE permission on the `salesshare` datashare to the specified namespace. 

```
GRANT USAGE ON DATASHARE salesshare TO NAMESPACE '13b8833d-17c6-4f16-8fe4-1a018f5ed00d';
```

In the following example, a consumer-side admin grants the USAGE permission on the `sales_db` database to `Bob`.

```
GRANT USAGE ON DATABASE sales_db TO Bob;
```

In the following example, a consumer-side admin grants the USAGE permission on the `sales_schema` schema to the `Analyst_role` role. `sales_schema` is an external schema that points to `sales_db`.

```
GRANT USAGE ON SCHEMA sales_schema TO ROLE Analyst_role;
```

At this point, `Bob` and `Analyst_role` can access all database objects in `sales_schema` and `sales_db`.

The following example shows granting additional object-level permission for objects in a shared database. These extra permissions are only necessary if the CREATE DATABASE command that was used to create the shared database used the WITH PERMISSIONS clause. If the CREATE DATABASE command didn’t use WITH PERMISSIONS, granting USAGE on the shared database grants full access to all objects in that database.

```
GRANT SELECT ON sales_db.sales_schema.tickit_sales_redshift to Bob;
```

## Examples of granting scoped permissions
<a name="r_GRANT-examples-scoped"></a>

The following example grants usage for all current and future schemas in the `Sales_db` database to the `Sales` role.

```
GRANT USAGE FOR SCHEMAS IN DATABASE Sales_db TO ROLE Sales;
```

The following example grants the SELECT permission for all current and future tables in the `Sales_db` database to the user `alice`, and also gives `alice` the permission to grant scoped permissions on tables in `Sales_db` to other users.

```
GRANT SELECT FOR TABLES IN DATABASE Sales_db TO alice WITH GRANT OPTION;
```

The following example grants the EXECUTE permission for functions in the `Sales_schema` schema to the user `bob`.

```
GRANT EXECUTE FOR FUNCTIONS IN SCHEMA Sales_schema TO bob;
```

The following example grants all permissions for all tables in the `ShareDb` database’s `ShareSchema` schema to the `Sales` role. When specifying the schema, you can specify the schema’s database using the two-part format `database.schema`.

```
GRANT ALL FOR TABLES IN SCHEMA ShareDb.ShareSchema TO ROLE Sales;
```

The following example is the same as the preceding one. You can specify the database using the `DATABASE` keyword instead of using a two-part format.

```
GRANT ALL FOR TABLES IN SCHEMA ShareSchema DATABASE ShareDb TO ROLE Sales;
```

## Examples of granting the ASSUMEROLE privilege
<a name="r_GRANT-examples-assumerole"></a>

The following are examples of granting the ASSUMEROLE privilege.

The following example shows the REVOKE statement that a superuser runs once on the cluster to enable the use of the ASSUMEROLE privilege for users and groups. Then, the superuser grants the ASSUMEROLE privilege to users and groups for the appropriate commands. For information on enabling the use of the ASSUMEROLE privilege for users and groups, see [Usage notes for granting the ASSUMEROLE permission](r_GRANT-usage-notes.md#r_GRANT-usage-notes-assumerole).

```
revoke assumerole on all from public for all;
```

The following example grants the ASSUMEROLE privilege to the user `reg_user1` for the IAM role `Redshift-S3-Read` to perform COPY operations. 

```
grant assumerole on 'arn:aws:iam::123456789012:role/Redshift-S3-Read'
to reg_user1 for copy;
```

The following example grants the ASSUMEROLE privilege to the user `reg_user1` for the IAM role chain `RoleA`, `RoleB` to perform UNLOAD operations. 

```
grant assumerole
on 'arn:aws:iam::123456789012:role/RoleA,arn:aws:iam::210987654321:role/RoleB'
to reg_user1
for unload;
```

The following is an example of the UNLOAD command using the IAM role chain `RoleA`, `RoleB`.

```
unload ('select * from venue limit 10')
to 's3://companyb/redshift/venue_pipe_'
iam_role 'arn:aws:iam::123456789012:role/RoleA,arn:aws:iam::210987654321:role/RoleB';
```

The following example grants the ASSUMEROLE privilege to the user `reg_user1` for the IAM role `Redshift-Exfunc` to create external functions. 

```
grant assumerole on 'arn:aws:iam::123456789012:role/Redshift-Exfunc'
to reg_user1 for external function;
```

The following example grants the ASSUMEROLE privilege to the user `reg_user1` for the IAM role `Redshift-ML` to create machine learning models. 

```
grant assumerole on 'arn:aws:iam::123456789012:role/Redshift-ML'
to reg_user1 for create model;
```

## Examples of granting the ROLE privileges
<a name="r_GRANT-examples-role"></a>

The following example grants sample\_role1 to user1.

```
CREATE ROLE sample_role1;
GRANT ROLE sample_role1 TO user1;
```

The following example grants sample\_role1 to user1 with the WITH ADMIN OPTION, sets the session authorization to user1, and then user1 grants sample\_role1 to user2.

```
GRANT ROLE sample_role1 TO user1 WITH ADMIN OPTION;
SET SESSION AUTHORIZATION user1;
GRANT ROLE sample_role1 TO user2;
```

The following example grants sample\_role1 to sample\_role2.

```
GRANT ROLE sample_role1 TO ROLE sample_role2;
```

The following example grants sample\_role2 to sample\_role3. Then it attempts to grant sample\_role3 to sample\_role2, which fails because this would create a circular dependency.

```
GRANT ROLE sample_role2 TO ROLE sample_role3;
GRANT ROLE sample_role3 TO ROLE sample_role2;
ERROR: cannot grant this role, a circular dependency was detected between these roles
```

The following example grants the CREATE USER system privilege to sample\_role1.

```
GRANT CREATE USER TO ROLE sample_role1;
```

The following example grants the system-defined role `sys:dba` to user1.

```
GRANT ROLE sys:dba TO user1;
```

The following example attempts to grant sample\_role3 to sample\_role2, which fails because this would create a circular dependency.

```
CREATE ROLE sample_role3;
GRANT ROLE sample_role2 TO ROLE sample_role3;
GRANT ROLE sample_role3 TO ROLE sample_role2; -- fail
ERROR:  cannot grant this role, a circular dependency was detected between these roles
```

# INSERT
<a name="r_INSERT_30"></a>

**Topics**
+ [Syntax](#r_INSERT_30-synopsis)
+ [Parameters](#r_INSERT_30-parameters)
+ [Usage notes](#r_INSERT_30_usage_notes)
+ [INSERT examples](c_Examples_of_INSERT_30.md)

Inserts new rows into a table. You can insert a single row with the VALUES syntax, multiple rows with the VALUES syntax, or one or more rows defined by the results of a query (INSERT INTO...SELECT).

**Note**  
We strongly encourage you to use the [COPY](r_COPY.md) command to load large amounts of data. Using individual INSERT statements to populate a table might be prohibitively slow. Alternatively, if your data already exists in other Amazon Redshift database tables, use INSERT INTO SELECT or [CREATE TABLE AS](r_CREATE_TABLE_AS.md) to improve performance. For more information about using the COPY command to load tables, see [Loading data in Amazon Redshift](t_Loading_data.md).

**Note**  
The maximum size for a single SQL statement is 16 MB.

## Syntax
<a name="r_INSERT_30-synopsis"></a>

```
INSERT INTO table_name [ ( column [, ...] ) ]
{DEFAULT VALUES |
VALUES ( { expression | DEFAULT } [, ...] )
[, ( { expression | DEFAULT } [, ...] )
[, ...] ] |
query }
```

## Parameters
<a name="r_INSERT_30-parameters"></a>

 *table\_name*   
A temporary or persistent table. Only the owner of the table or a user with INSERT privilege on the table can insert rows. If you use the *query* clause to insert rows, you must have SELECT privilege on the tables named in the query.   
Use INSERT (external table) to insert results of a SELECT query into existing tables on an external catalog. For more information, see [INSERT (external table)](r_INSERT_external_table.md).

 *column*   
You can insert values into one or more columns of the table. You can list the target column names in any order. If you don't specify a column list, the values to be inserted must correspond to the table columns in the order in which they were declared in the CREATE TABLE statement. If the number of values to be inserted is less than the number of columns in the table, the first *n* columns are loaded.   
Either the declared default value or a null value is loaded into any column that isn't listed (implicitly or explicitly) in the INSERT statement. 

DEFAULT VALUES   
If the columns in the table were assigned default values when the table was created, use these keywords to insert a row that consists entirely of default values. If any of the columns don't have default values, nulls are inserted into those columns. If any of the columns are declared NOT NULL, the INSERT statement returns an error. 

VALUES   
Use this keyword to insert one or more rows, each row consisting of one or more values. The VALUES list for each row must align with the column list. To insert multiple rows, use a comma delimiter between each list of expressions. Do not repeat the VALUES keyword. All VALUES lists for a multiple-row INSERT statement must contain the same number of values. 

 *expression*   
A single value or an expression that evaluates to a single value. Each value must be compatible with the data type of the column where it is being inserted. If possible, a value whose data type doesn't match the column's declared data type is automatically converted to a compatible data type. For example:   
+ A decimal value `1.1` is inserted into an INT column as `1`. 
+ A decimal value `100.8976` is inserted into a DEC(5,2) column as `100.90`. 
You can explicitly convert a value to a compatible data type by including type cast syntax in the expression. For example, if column COL1 in table T1 is a CHAR(3) column:   

```
insert into t1(col1) values('Incomplete'::char(3));
```
This statement inserts the value `Inc` into the column.   
For a single-row INSERT VALUES statement, you can use a scalar subquery as an expression. The result of the subquery is inserted into the appropriate column.   
Subqueries aren't supported as expressions for multiple-row INSERT VALUES statements. 
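For example, the following single-row INSERT sketch uses a scalar subquery to compute the CATID value. It assumes the sample TICKIT `category` table; the category values shown are hypothetical.   

```
insert into category(catid, catgroup, catname, catdesc)
values ((select max(catid)+1 from category),
'Concerts', 'Folk', 'All folk music concerts');
```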

DEFAULT   
Use this keyword to insert the default value for a column, as defined when the table was created. If no default value exists for a column, a null is inserted. You can't insert a default value into a column that has a NOT NULL constraint if that column doesn't have an explicit default value assigned to it in the CREATE TABLE statement. 

 *query*   
Insert one or more rows into the table by defining any query. All of the rows that the query produces are inserted into the table. The query must return a column list that is compatible with the columns in the table, but the column names don't have to match. 

## Usage notes
<a name="r_INSERT_30_usage_notes"></a>

**Note**  
We strongly encourage you to use the [COPY](r_COPY.md) command to load large amounts of data. Using individual INSERT statements to populate a table might be prohibitively slow. Alternatively, if your data already exists in other Amazon Redshift database tables, use INSERT INTO SELECT or [CREATE TABLE AS](r_CREATE_TABLE_AS.md) to improve performance. For more information about using the COPY command to load tables, see [Loading data in Amazon Redshift](t_Loading_data.md).

The data format for the inserted values must match the data format specified by the CREATE TABLE definition. 

 After inserting a large number of new rows into a table: 
+ Vacuum the table to reclaim storage space and re-sort rows. 
+ Analyze the table to update statistics for the query planner. 

When values are inserted into DECIMAL columns and they exceed the specified scale, the loaded values are rounded up as appropriate. For example, when a value of `20.259` is inserted into a DECIMAL(8,2) column, the value that is stored is `20.26`. 
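As a sketch of this rounding behavior, assuming a hypothetical `prices` table:

```
create table prices (amount decimal(8,2));
insert into prices values (20.259);

select amount from prices;

 amount
--------
  20.26
(1 row)
```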

You can insert into a GENERATED BY DEFAULT AS IDENTITY column. You can update columns defined as GENERATED BY DEFAULT AS IDENTITY with values that you supply. For more information, see [GENERATED BY DEFAULT AS IDENTITY](r_CREATE_TABLE_NEW.md#identity-generated-bydefault-clause). 
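For example, the following sketch uses a hypothetical `event_log` table to show both behaviors:

```
create table event_log
(event_id bigint generated by default as identity(1,1),
event_name varchar(20));

-- The value for event_id is generated automatically.
insert into event_log (event_name) values ('signup');

-- You can also supply an explicit value for event_id.
insert into event_log values (100, 'login');
```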

# INSERT examples
<a name="c_Examples_of_INSERT_30"></a>

The CATEGORY table in the TICKIT database contains the following rows: 

```
 catid | catgroup |  catname  |                  catdesc
-------+----------+-----------+--------------------------------------------
     1 | Sports   | MLB       | Major League Baseball
     2 | Sports   | NHL       | National Hockey League
     3 | Sports   | NFL       | National Football League
     4 | Sports   | NBA       | National Basketball Association
     5 | Sports   | MLS       | Major League Soccer
     6 | Shows    | Musicals  | Musical theatre
     7 | Shows    | Plays     | All non-musical theatre
     8 | Shows    | Opera     | All opera and light opera
     9 | Concerts | Pop       | All rock and pop music concerts
    10 | Concerts | Jazz      | All jazz singers and bands
    11 | Concerts | Classical | All symphony, concerto, and choir concerts
(11 rows)
```

 Create a CATEGORY\_STAGE table with a similar schema to the CATEGORY table but define default values for the columns: 

```
create table category_stage
(catid smallint default 0,
catgroup varchar(10) default 'General',
catname varchar(10) default 'General',
catdesc varchar(50) default 'General');
```

The following INSERT statement selects all of the rows from the CATEGORY table and inserts them into the CATEGORY\_STAGE table. 

```
insert into category_stage
(select * from category);
```

The parentheses around the query are optional.

This command inserts a new row into the CATEGORY\_STAGE table with a value specified for each column in order: 

```
insert into category_stage values
(12, 'Concerts', 'Comedy', 'All stand-up comedy performances');
```

You can also insert a new row that combines specific values and default values: 

```
insert into category_stage values
(13, 'Concerts', 'Other', default);
```

Run the following query to return the inserted rows: 

```
select * from category_stage
where catid in(12,13) order by 1;

 catid | catgroup | catname |             catdesc
-------+----------+---------+----------------------------------
    12 | Concerts | Comedy  | All stand-up comedy performances
    13 | Concerts | Other   | General
(2 rows)
```

The following examples show some multiple-row INSERT VALUES statements. The first example inserts specific CATID values for two rows and default values for the other columns in both rows. 

```
insert into category_stage values
(14, default, default, default),
(15, default, default, default);

select * from category_stage where catid in(14,15) order by 1;
 catid | catgroup | catname | catdesc
-------+----------+---------+---------
    14 | General  | General | General
    15 | General  | General | General
(2 rows)
```

The next example inserts three rows with various combinations of specific and default values: 

```
insert into category_stage values
(default, default, default, default),
(20, default, 'Country', default),
(21, 'Concerts', 'Rock', default);

select * from category_stage where catid in(0,20,21) order by 1;
 catid | catgroup | catname | catdesc
-------+----------+---------+---------
     0 | General  | General | General
    20 | General  | Country | General
    21 | Concerts | Rock    | General
(3 rows)
```

The first set of VALUES in this example produces the same results as specifying DEFAULT VALUES for a single-row INSERT statement.

The following examples show INSERT behavior when a table has an IDENTITY column. First, create a new version of the CATEGORY table, then insert rows into it from CATEGORY: 

```
create table category_ident
(catid int identity not null,
catgroup varchar(10) default 'General',
catname varchar(10) default 'General',
catdesc varchar(50) default 'General');


insert into category_ident(catgroup,catname,catdesc)
select catgroup,catname,catdesc from category;
```

Note that you can't insert specific integer values into the CATID IDENTITY column. IDENTITY column values are automatically generated.

The following example demonstrates that subqueries can't be used as expressions in multiple-row INSERT VALUES statements: 

```
insert into category(catid) values
((select max(catid)+1 from category)),
((select max(catid)+2 from category));

ERROR: can't use subqueries in multi-row VALUES
```

The following example shows an insert into a temporary table populated with data from the `venue` table using a `WITH` clause. For more information about the `venue` table, see [Sample database](c_sampledb.md).

First, create the temporary table `#venuetemp`.

```
CREATE TABLE #venuetemp AS SELECT * FROM venue;
```

List the rows in the `#venuetemp` table.

```
SELECT * FROM #venuetemp ORDER BY venueid;

 venueid |         venuename         |  venuecity  | venuestate | venueseats
---------+---------------------------+-------------+------------+------------
       1 | Toyota Park               | Bridgeview  | IL         |          0
       2 | Columbus Crew Stadium     | Columbus    | OH         |          0
       3 | RFK Stadium               | Washington  | DC         |          0
       4 | CommunityAmerica Ballpark | Kansas City | KS         |          0
       5 | Gillette Stadium          | Foxborough  | MA         |      68756
...
```

Insert 10 duplicate rows into the `#venuetemp` table using a `WITH` clause in the query.

```
INSERT INTO #venuetemp (WITH venuecopy AS (SELECT * FROM venue) SELECT * FROM venuecopy ORDER BY 1 LIMIT 10);
```

List the rows in the `#venuetemp` table.

```
SELECT * FROM #venuetemp ORDER BY venueid;

 venueid |         venuename         |  venuecity  | venuestate | venueseats
---------+---------------------------+-------------+------------+------------
       1 | Toyota Park               | Bridgeview  | IL         |          0
       1 | Toyota Park               | Bridgeview  | IL         |          0
       2 | Columbus Crew Stadium     | Columbus    | OH         |          0
       2 | Columbus Crew Stadium     | Columbus    | OH         |          0
       3 | RFK Stadium               | Washington  | DC         |          0
       3 | RFK Stadium               | Washington  | DC         |          0
       4 | CommunityAmerica Ballpark | Kansas City | KS         |          0
       4 | CommunityAmerica Ballpark | Kansas City | KS         |          0
       5 | Gillette Stadium          | Foxborough  | MA         |      68756
       5 | Gillette Stadium          | Foxborough  | MA         |      68756
...
```

# INSERT (external table)
<a name="r_INSERT_external_table"></a>

Inserts the results of a SELECT query into existing external tables on an external catalog, such as AWS Glue, AWS Lake Formation, or an Apache Hive metastore. Use the same AWS Identity and Access Management (IAM) role used for the CREATE EXTERNAL SCHEMA command to interact with external catalogs and Amazon S3.

For nonpartitioned tables, the INSERT (external table) command writes data to the Amazon S3 location defined in the table, based on the specified table properties and file format.

For partitioned tables, INSERT (external table) writes data to the Amazon S3 location according to the partition key specified in the table. It also automatically registers new partitions in the external catalog after the INSERT operation completes.

You can't run INSERT (external table) within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 

## Syntax
<a name="r_INSERT_external_table-synopsis"></a>

```
INSERT INTO external_schema.table_name
{ select_statement }
```

## Parameters
<a name="r_INSERT_external_table-parameters"></a>

 *external\_schema.table\_name*   
The name of an existing external schema and a target external table to insert into.

 *select\_statement*   
A statement that inserts one or more rows into the external table by defining any query. All of the rows that the query produces are written to Amazon S3 in either text or Parquet format based on the table definition. The query must return a column list that is compatible with the column data types in the external table. However, the column names don't have to match.

## Usage notes
<a name="r_INSERT_external_table_usage_notes"></a>

The number of columns in the SELECT query must be the same as the sum of data columns and partition columns. The location and the data type of each data column must match that of the external table. The location of partition columns must be at the end of the SELECT query, in the same order they were defined in the CREATE EXTERNAL TABLE command. The column names don't have to match.

In some cases, you might want to run the INSERT (external table) command on an AWS Glue Data Catalog or a Hive metastore. In the case of AWS Glue, the IAM role used to create the external schema must have both read and write permissions on Amazon S3 and AWS Glue. If you use an AWS Lake Formation catalog, this IAM role becomes the owner of the new Lake Formation table. This IAM role must at least have the following permissions: 
+ SELECT, INSERT, UPDATE permission on the external table
+ Data location permission on the Amazon S3 path of the external table

To ensure that file names are unique, Amazon Redshift uses the following format for the name of each file uploaded to Amazon S3 by default. 

`<date>_<time>_<microseconds>_<query_id>_<slice-number>_part_<part-number>.<format>`.

An example is `20200303_004509_810669_1007_0001_part_00.parquet`.

Consider the following when running the INSERT (external table) command:
+ External tables that have a format other than PARQUET or TEXTFILE aren't supported.
+ This command supports existing table properties such as 'write.parallel', 'write.maxfilesize.mb', 'compression\_type', and 'serialization.null.format'. To update those values, run the ALTER TABLE SET TABLE PROPERTIES command.
+ The 'numRows' table property is automatically updated toward the end of the INSERT operation. The table property must already be defined or added to the table if the table wasn't created by a CREATE EXTERNAL TABLE AS operation.
+ The LIMIT clause isn't supported in the outer SELECT query. Instead, use a nested LIMIT clause.
+ You can use the [STL\_UNLOAD\_LOG](r_STL_UNLOAD_LOG.md) table to track the files that got written to Amazon S3 by each INSERT (external table) operation.
+ Amazon Redshift supports only Amazon S3 standard encryption for INSERT (external table).
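For example, to write only 10 rows to the external table, place the LIMIT in a nested subquery rather than in the outer SELECT. This sketch reuses the `spectrum.lineitem` and `local_lineitem` names for illustration.

```
-- Fails: LIMIT in the outer SELECT isn't supported.
-- INSERT INTO spectrum.lineitem
-- SELECT * FROM local_lineitem LIMIT 10;

-- Works: LIMIT in a nested subquery.
INSERT INTO spectrum.lineitem
SELECT * FROM (SELECT * FROM local_lineitem LIMIT 10);
```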

## INSERT (external table) examples
<a name="c_Examples_of_INSERT_external_table"></a>

The following example inserts the results of the SELECT statement into the external table.

```
INSERT INTO spectrum.lineitem
SELECT * FROM local_lineitem;
```

The following example inserts the results of the SELECT statement into a partitioned external table using static partitioning. The partition columns are hardcoded in the SELECT statement. The partition columns must be at the end of the query.

```
INSERT INTO spectrum.customer
SELECT name, age, gender, 'May', 28 FROM local_customer;
```

The following example inserts the results of the SELECT statement into a partitioned external table using dynamic partitioning. The partition columns aren't hardcoded. Data is automatically added to the existing partition folders, or to new folders if a new partition is added.

```
INSERT INTO spectrum.customer
SELECT name, age, gender, month, day FROM local_customer;
```

# LOCK
<a name="r_LOCK"></a>

Restricts access to a database table. This command is only meaningful when it is run inside a transaction block.

The LOCK command obtains a table-level lock in "ACCESS EXCLUSIVE" mode, waiting if necessary for any conflicting locks to be released. Explicitly locking a table in this way causes reads and writes on the table to wait when they are attempted from other transactions or sessions. An explicit table lock created by one user temporarily prevents another user from selecting data from that table or loading data into it. The lock is released when the transaction that contains the LOCK command completes.

Less restrictive table locks are acquired implicitly by commands that refer to tables, such as write operations. For example, if a user tries to read data from a table while another user is updating the table, the data that is read will be a snapshot of the data that has already been committed. (In some cases, queries will stop if they violate serializable isolation rules.) See [Managing concurrent write operations](c_Concurrent_writes.md).

Some DDL operations, such as DROP TABLE and TRUNCATE, create exclusive locks. These operations prevent data reads.

If a lock conflict occurs, Amazon Redshift displays an error message to alert the user who started the transaction in conflict. The transaction that received the lock conflict is stopped. Every time a lock conflict occurs, Amazon Redshift writes an entry to the [STL\_TR\_CONFLICT](r_STL_TR_CONFLICT.md) table.

## Syntax
<a name="section_r_LOCK-synopsis"></a>

```
LOCK [ TABLE ] table_name [, ...]
```

## Parameters
<a name="parameters"></a>

TABLE   
Optional keyword.

 *table\_name*   
Name of the table to lock. You can lock more than one table by using a comma-delimited list of table names. You can't lock views. 

## Example
<a name="example2"></a>

```
begin;

lock event, sales;

...
```

# MERGE
<a name="r_MERGE"></a>

Conditionally merges rows from a source table into a target table. Traditionally, this could only be achieved by using separate INSERT, UPDATE, or DELETE statements. For more information on the operations that MERGE lets you combine, see [UPDATE](https://docs.aws.amazon.com/redshift/latest/dg/r_UPDATE.html), [DELETE](https://docs.aws.amazon.com/redshift/latest/dg/r_DELETE.html), and [INSERT](https://docs.aws.amazon.com/redshift/latest/dg/r_INSERT_30.html).

## Syntax
<a name="r_MERGE-synopsis"></a>

```
MERGE INTO target_table 
USING source_table [ [ AS ] alias ] 
ON match_condition 
[ WHEN MATCHED THEN { UPDATE SET col_name = { expr } [,...] | DELETE }
WHEN NOT MATCHED THEN INSERT [ ( col_name [,...] ) ] VALUES ( { expr } [, ...] ) |
REMOVE DUPLICATES ]
```

## Parameters
<a name="r_MERGE-parameters"></a>

 *target\_table*  
The temporary or permanent table that the MERGE statement merges into.

 *source\_table*  
The temporary or permanent table supplying the rows to merge into *target\_table*. *source\_table* can also be a Spectrum table. 

 *alias*  
The temporary alternative name for *source\_table*.  
This parameter is optional. Preceding *alias* with AS is also optional.

 *match\_condition*  
Specifies equality predicates between source table columns and target table columns that determine whether a row in *source\_table* matches a row in *target\_table*. If the condition is met, MERGE runs the WHEN MATCHED clause for that row. Otherwise, MERGE runs the WHEN NOT MATCHED clause for that row.

WHEN MATCHED  
 Specifies the action to be run when the match condition between a source row and a target row evaluates to True. You can specify either an UPDATE action or a DELETE action. 

UPDATE  
 Updates the matched row in *target\_table*. Only values in the *col\_name* columns you specify are updated. 

DELETE  
 Deletes the matched row in *target\_table*. 

WHEN NOT MATCHED  
 Specifies the action to be run when the match condition evaluates to False or Unknown. You can only specify the INSERT action for this clause. 

INSERT  
 Inserts into *target\$1table* rows from *source\$1table* that don't match any rows in *target\$1table*, according to *match\$1condition*. The target *col\$1name* can be listed in any order. If you don’t provide any *col\$1name* values, the default order is all the table’s columns in their declared order. 

 *col\$1name*  
One or more column names that you want to modify. Don't include the table name when specifying the target column.

 *expr*  
The expression defining the new value for *col\$1name*.

 REMOVE DUPLICATES  
Specifies that the MERGE command runs in simplified mode. Simplified mode has the following requirements:  
+  *target\_table* and *source\_table* must have the same number of columns, compatible column types, and the same column order. 
+  Omit the WHEN clause and the UPDATE and INSERT clauses from your MERGE command. 
+  Use the REMOVE DUPLICATES clause in your MERGE command. 
In simplified mode, MERGE does the following:  
+  Rows in *target\_table* that have a match in *source\_table* are updated to match the values in *source\_table*. 
+  Rows in *source\_table* that don't have a match in *target\_table* are inserted into *target\_table*. 
+  When multiple rows in *target\_table* match the same row in *source\_table*, the duplicate rows are removed. Amazon Redshift keeps one row and updates it. Duplicate rows that don’t match a row in *source\_table* remain unchanged. 
Using REMOVE DUPLICATES gives better performance than using WHEN MATCHED and WHEN NOT MATCHED. We recommend using REMOVE DUPLICATES if *target\_table* and *source\_table* are compatible and you don't need to preserve duplicate rows in *target\_table*.
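As a minimal sketch of simplified mode, assuming hypothetical, structurally compatible `target` and `source` tables joined on an `id` column:

```
MERGE INTO target
USING source ON target.id = source.id
REMOVE DUPLICATES;
```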

## Usage notes
<a name="r_MERGE_usage_notes"></a>
+ To run MERGE statements, you must be the owner of both *source\_table* and *target\_table*, or have the SELECT permission for those tables. Additionally, you must have UPDATE, DELETE, and INSERT permissions for *target\_table* depending on the operations included in your MERGE statement.
+  *target\_table* can't be a system table, catalog table, or external table. 
+  *source\_table* and *target\_table* can't be the same table. 
+  You can't use the WITH clause in a MERGE statement. 
+  Rows in *target\_table* can't match multiple rows in *source\_table*. 

  Consider the following example:

  ```
  CREATE TABLE target (id INT, name CHAR(10));
  CREATE TABLE source (id INT, name CHAR(10));
  
  INSERT INTO target VALUES (1, 'Bob'), (2, 'John');
  INSERT INTO source VALUES (1, 'Tony'), (1, 'Alice'), (3, 'Bill');
  
  MERGE INTO target USING source ON target.id = source.id
  WHEN MATCHED THEN UPDATE SET id = source.id, name = source.name
  WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);
  ERROR: Found multiple matches to update the same tuple.
  
  MERGE INTO target USING source ON target.id = source.id
  WHEN MATCHED THEN DELETE
  WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);
  ERROR: Found multiple matches to update the same tuple.
  ```

  In both MERGE statements, the operation fails because there are multiple rows in the `source` table with an ID value of `1`.
+  *match\_condition* and *expr* can't partially reference SUPER type columns. For example, if your SUPER type object is an array or a structure, you can't use individual elements of that column for *match\_condition* or *expr*, but you can use the entire column. 

  Consider the following example:

  ```
  CREATE TABLE IF NOT EXISTS target (key INT, value SUPER);
  CREATE TABLE IF NOT EXISTS source (key INT, value SUPER);
  
  INSERT INTO target VALUES (1, JSON_PARSE('{"key": 88}'));
  INSERT INTO source VALUES (1, ARRAY(1, 'John')), (2, ARRAY(2, 'Bill'));
  
  MERGE INTO target USING source ON target.key = source.key
  WHEN matched THEN UPDATE SET value = source.value[0]
  WHEN NOT matched THEN INSERT VALUES (source.key, source.value[0]);
  ERROR: Partial reference of SUPER column is not supported in MERGE statement.
  ```

  For more information on the SUPER type, see [ SUPER type](https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html).
+ If *source\_table* is large, defining the join columns from both *target\_table* and *source\_table* as the distribution keys can improve performance.
+ To use the REMOVE DUPLICATES clause, you need SELECT, INSERT, and DELETE permissions for *target\_table*.
+  *source\_table* can be a view or subquery. Following is an example of a MERGE statement where *source\_table* is a subquery that removes duplicate rows. 

  ```
  MERGE INTO target
  USING (SELECT id, name FROM source GROUP BY 1, 2) as my_source
  ON target.id = my_source.id
  WHEN MATCHED THEN UPDATE SET id = my_source.id, name = my_source.name
  WHEN NOT MATCHED THEN INSERT VALUES (my_source.id, my_source.name);
  ```
+ The target table can't be a data source of any subquery of the same MERGE statement. For example, the following SQL command returns an error like `ERROR: Source view/subquery in Merge statement cannot reference target table` because the subquery references `target` instead of `source`.

  ```
  MERGE INTO target
  USING (SELECT id, name FROM target GROUP BY 1, 2) as my_source
  ON target.id = my_source.id
  WHEN MATCHED THEN UPDATE SET id = my_source.id, name = my_source.name
  WHEN NOT MATCHED THEN INSERT VALUES (my_source.id, my_source.name);
  ```

## Examples
<a name="sub-examples-merge"></a>

The following example creates two tables, then runs a MERGE operation on them, updating matching rows in the target table and inserting rows that don't match. Then it inserts another value into the source table and runs another MERGE operation, this time deleting matching rows and inserting the new row from the source table.

First create and populate the source and target tables.

```
CREATE TABLE target (id INT, name CHAR(10));
CREATE TABLE source (id INT, name CHAR(10));

INSERT INTO target VALUES (101, 'Bob'), (102, 'John'), (103, 'Susan');
INSERT INTO source VALUES (102, 'Tony'), (103, 'Alice'), (104, 'Bill');

SELECT * FROM target;
 id  |    name
-----+------------
 101 | Bob
 102 | John
 103 | Susan
(3 rows)

SELECT * FROM source;
 id  |    name
-----+------------
 102 | Tony
 103 | Alice
 104 | Bill
(3 rows)
```

Next, merge the source table into the target table, updating the target table with matching rows and insert rows from the source table that have no match.

```
MERGE INTO target USING source ON target.id = source.id
WHEN MATCHED THEN UPDATE SET id = source.id, name = source.name
WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);

SELECT * FROM target;
 id  |    name
-----+------------
 101 | Bob
 102 | Tony
 103 | Alice
 104 | Bill
(4 rows)
```

Note that the rows with id values of 102 and 103 are updated to match the name values from the source table. Also, a new row with an id value of 104 and a name value of Bill is inserted into the target table.

Next, insert a new row into the source table.

```
INSERT INTO source VALUES (105, 'David');

SELECT * FROM source;
 id  |    name
-----+------------
 102 | Tony
 103 | Alice
 104 | Bill
 105 | David
(4 rows)
```

Finally, run a merge operation deleting matching rows in the target table, and inserting rows that don't match.

```
MERGE INTO target USING source ON target.id = source.id
WHEN MATCHED THEN DELETE
WHEN NOT MATCHED THEN INSERT VALUES (source.id, source.name);

SELECT * FROM target;
 id  |    name
-----+------------
 101 | Bob
 105 | David
(2 rows)
```

The rows with id values 102, 103, and 104 are deleted from the target table, and a new row with an id value of 105 and name value of David is inserted into the target table.

The following example shows the simplified syntax of a MERGE command that uses the REMOVE DUPLICATES clause.

```
CREATE TABLE target (id INT, name CHAR(10));
CREATE TABLE source (id INT, name CHAR(10));

INSERT INTO target VALUES (30, 'Tony'), (11, 'Alice'), (23, 'Bill');
INSERT INTO source VALUES (23, 'David'), (22, 'Clarence');

MERGE INTO target USING source ON target.id = source.id REMOVE DUPLICATES;

SELECT * FROM target;
id | name
---+------------
30 | Tony
11 | Alice
23 | David
22 | Clarence
(4 rows)
```

The following example shows the simplified syntax of a MERGE command that uses the REMOVE DUPLICATES clause, removing duplicate rows from *target\_table* if they have matching rows in *source\_table*.

```
CREATE TABLE target (id INT, name CHAR(10));
CREATE TABLE source (id INT, name CHAR(10));

INSERT INTO target VALUES (30, 'Tony'), (30, 'Daisy'), (11, 'Alice'), (23, 'Bill'), (23, 'Nikki');
INSERT INTO source VALUES (23, 'David'), (22, 'Clarence');

MERGE INTO target USING source ON target.id = source.id REMOVE DUPLICATES;

SELECT * FROM target;
id | name
---+------------
30 | Tony
30 | Daisy
11 | Alice
23 | David
22 | Clarence
(5 rows)
```

After MERGE runs, there's only one row with an ID value of 23 in *target\_table*. Because there was no row in *source\_table* with the ID value 30, the two duplicate rows with ID values of 30 remain in *target\_table*.

## See also
<a name="r_MERGE-see-also"></a>

 [INSERT](r_INSERT_30.md), [UPDATE](r_UPDATE.md), [DELETE](r_DELETE.md) 

# PREPARE
<a name="r_PREPARE"></a>

Prepare a statement for execution. 

PREPARE creates a prepared statement. When the PREPARE statement is run, the specified statement (SELECT, INSERT, UPDATE, or DELETE) is parsed, rewritten, and planned. When an EXECUTE command is then issued for the prepared statement, Amazon Redshift may optionally revise the query execution plan (to improve performance based on the specified parameter values) before running the prepared statement. 

## Syntax
<a name="r_PREPARE-synopsis"></a>

```
PREPARE plan_name [ (datatype [, ...] ) ] AS statement
```

## Parameters
<a name="r_PREPARE-parameters"></a>

 *plan\_name*   
An arbitrary name given to this particular prepared statement. It must be unique within a single session and is subsequently used to run or deallocate a previously prepared statement.

 *datatype*   
The data type of a parameter to the prepared statement. To refer to the parameters in the prepared statement itself, use \$1, \$2, and so on, up to a maximum of \$32767.

 *statement*   
Any SELECT, INSERT, UPDATE, or DELETE statement.

## Usage notes
<a name="r_PREPARE_usage_notes"></a>

Prepared statements can take parameters: values that are substituted into the statement when it is run. To include parameters in a prepared statement, supply a list of data types in the PREPARE statement, and, in the statement to be prepared itself, refer to the parameters by position using the notation \$1, \$2, and so on. The maximum number of parameters is 32767. When running the statement, specify the actual values for these parameters in the EXECUTE statement. For more details, see [EXECUTE](r_EXECUTE.md). 

Prepared statements only last for the duration of the current session. When the session ends, the prepared statement is discarded, so it must be re-created before being used again. This also means that a single prepared statement can't be used by multiple simultaneous database clients; however, each client can create its own prepared statement to use. The prepared statement can be manually removed using the DEALLOCATE command. 

Prepared statements have the largest performance advantage when a single session is being used to run a large number of similar statements. As mentioned, for each new execution of a prepared statement, Amazon Redshift may revise the query execution plan to improve performance based on the specified parameter values. To examine the query execution plan that Amazon Redshift has chosen for any specific EXECUTE statements, use the [EXPLAIN](r_EXPLAIN.md) command. 

For more information on query planning and the statistics collected by Amazon Redshift for query optimization, see the [ANALYZE](r_ANALYZE.md) command. 

## Examples
<a name="sub-examples-prepare"></a>

Create a table, prepare an INSERT statement, and then run it:

```
DROP TABLE IF EXISTS prep1;
CREATE TABLE prep1 (c1 int, c2 char(20));
PREPARE prep_insert_plan (int, char)
AS insert into prep1 values ($1, $2);
EXECUTE prep_insert_plan (1, 'one');
EXECUTE prep_insert_plan (2, 'two');
EXECUTE prep_insert_plan (3, 'three');
DEALLOCATE prep_insert_plan;
```

Prepare a SELECT statement and then run it:

```
PREPARE prep_select_plan (int)
AS select * from prep1 where c1 = $1;
EXECUTE prep_select_plan (2);
EXECUTE prep_select_plan (3);
DEALLOCATE prep_select_plan;
```

## See also
<a name="r_PREPARE-see-also"></a>

 [DEALLOCATE](r_DEALLOCATE.md), [EXECUTE](r_EXECUTE.md) 

# REFRESH MATERIALIZED VIEW
<a name="materialized-view-refresh-sql-command"></a>

Refreshes a materialized view.

When you create a materialized view, its contents reflect the state of the underlying database table or tables at that time. The data in the materialized view remains unchanged, even when applications make changes to the data in the underlying tables.

To update the data in a materialized view, you can use the `REFRESH MATERIALIZED VIEW` statement at any time. When you use this statement, Amazon Redshift identifies changes that have taken place in the base table or tables, and then applies those changes to the materialized view.

For more information about materialized views, see [Materialized views in Amazon Redshift](materialized-view-overview.md).

## Syntax
<a name="mv_REFRESH_MATERIALIZED_VIEW-synopsis"></a>

```
REFRESH MATERIALIZED VIEW mv_name [ RESTRICT | CASCADE ]
```

## Parameters
<a name="mv_REFRESH_MATERIALIZED_VIEW-parameters"></a>

*mv\_name*  
The name of the materialized view to be refreshed.

RESTRICT  
Optional keyword. Refreshes the specified materialized view but not its dependent materialized views. This is the default if neither RESTRICT nor CASCADE is specified.

CASCADE  
Optional keyword. Refreshes the specified materialized view and all its dependent materialized views.

## Usage notes
<a name="mv_REFRESH_MARTERIALIZED_VIEW_usage"></a>

Only the owner of a materialized view can perform a `REFRESH MATERIALIZED VIEW` operation on that materialized view. Furthermore, the owner must have SELECT privilege on the underlying base tables to successfully run `REFRESH MATERIALIZED VIEW`. 

The `REFRESH MATERIALIZED VIEW` command runs as a transaction of its own. Amazon Redshift transaction semantics are followed to determine what data from base tables is visible to the `REFRESH` command, and when the changes made by the `REFRESH` command are made visible to other transactions running in Amazon Redshift.
+ For incremental materialized views, `REFRESH MATERIALIZED VIEW` uses only those base table rows that are already committed. Therefore, if the refresh operation runs after a data manipulation language (DML) statement in the same transaction, then changes of that DML statement aren't visible to refresh. 
+ For a full refresh of a materialized view, `REFRESH MATERIALIZED VIEW` sees all base table rows visible to the refresh transaction, according to usual Amazon Redshift transaction semantics. 
+ Amazon Redshift supports incremental refresh for materialized views that use the following functions, depending on the input argument type: DATE (timestamp), DATE\_PART (date, time, interval, time-tz), DATE\_TRUNC (timestamp, interval).
+ Incremental refresh is supported on a materialized view where the base table is in a datashare.
+ Refresh of shared materialized views from remote datasharing clusters isn't supported for materialized views that contain references to other materialized views, Spectrum tables, tables defined in a different Redshift cluster, or UDFs. Such materialized views can be refreshed from the local (producer) cluster.

Some operations in Amazon Redshift interact with materialized views. Some of these operations might force a `REFRESH MATERIALIZED VIEW` operation to fully recompute the materialized view even though the query defining the materialized view only uses the SQL features eligible for incremental refresh. For example:
+ Background vacuum operations might be blocked if materialized views aren't refreshed. After an internally defined threshold period, a vacuum operation is allowed to run. When this vacuum operation happens, any dependent materialized views are marked for recomputation upon the next refresh (even if they are incremental). For information about VACUUM, see [VACUUM](r_VACUUM_command.md). For more information about events and state changes, see [STL\_MV\_STATE](r_STL_MV_STATE.md).
+ Some user-initiated operations on base tables force a materialized view to be fully recomputed the next time that a REFRESH operation is run. Examples of such operations are a manually invoked VACUUM, a classic resize, an ALTER DISTKEY operation, an ALTER SORTKEY operation, and a truncate operation. Automatic operations can in some cases also result in a materialized view being fully recomputed the next time a REFRESH operation is run. For example, an auto-vacuum delete operation can cause a full recompute. For more information about events and state changes, see [STL\_MV\_STATE](r_STL_MV_STATE.md). 

## Cascading refresh
<a name="mv_REFRESH_MATERIALIZED_VIEW_cascading"></a>

The CASCADE option refreshes the specified materialized view and all its dependent materialized views, in order of dependency: base materialized views are refreshed before the materialized views that depend on them (topological ordering). This allows you to update a nested set of materialized views in a single command.

The RESTRICT option (the default if neither RESTRICT nor CASCADE is specified) refreshes only the specified materialized view.

When using the CASCADE option, the following rules apply:
+ Only the owner of the materialized view or a superuser can run the `REFRESH MATERIALIZED VIEW ... CASCADE` command.
+ If any of the materialized views in the cascade can't be refreshed, the entire cascade operation stops.

Cascading refresh is only supported for materialized views nested on top of local and streaming materialized views. Materialized views with other source types, such as Spectrum or data sharing, aren't supported in cascade mode. CASCADE runs the refresh in a single transaction for all nested materialized views.

## Incremental refresh for materialized views in a datashare
<a name="mv_REFRESH_MATERIALIZED_VIEW_datashare"></a>

 Amazon Redshift supports automatic and incremental refresh for materialized views in a consumer datashare when the base tables are shared. Incremental refresh is an operation where Amazon Redshift identifies changes in the base table or tables that happened after the previous refresh and updates only the corresponding records in the materialized view. For more information about this behavior, see [CREATE MATERIALIZED VIEW](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MARTERIALIZED_VIEW_datashare). 

## Limitations for incremental refresh
<a name="mv_REFRESH_MARTERIALIZED_VIEW_limitations"></a>

Amazon Redshift currently doesn't support incremental refresh for materialized views that are defined with a query using any of the following SQL elements:
+ OUTER JOIN (RIGHT, LEFT, or FULL).
+ Set operations: UNION, INTERSECT, EXCEPT, MINUS.
+ UNION ALL when it occurs in a subquery and an aggregate function or a GROUP BY clause is present in the query, or when the target materialized view contains a sort key.
+ Aggregate functions: MEDIAN, PERCENTILE\_CONT, LISTAGG, STDDEV\_SAMP, STDDEV\_POP, APPROXIMATE COUNT, APPROXIMATE PERCENTILE, and bitwise aggregate functions.
**Note**  
The COUNT, SUM, MIN, MAX, and AVG aggregate functions are supported.
+ DISTINCT aggregate functions, such as DISTINCT COUNT, DISTINCT SUM, and so on.
+ Window functions.
+ A query that uses temporary tables for query optimization, such as optimizing common subexpressions.
+ Subqueries.
+ External tables referencing the following formats in the query that defines the materialized view. 
  +  Delta Lake 
  +  Hudi 

  Incremental refresh is supported for materialized views defined using formats other than those listed above. For more information, see [Materialized views on external data lake tables in Amazon Redshift Spectrum](materialized-view-external-table.md). 
+ Mutable functions, such as date-time functions, RANDOM, and non-STABLE user-defined functions.
+ For limitations regarding incremental refresh for zero-ETL integrations, see [Considerations when using zero-ETL integrations with Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/zero-etl.reqs-lims.html).
+ Accessing tables from more than one database.

For more information about materialized-view limitations, including the effect of background operations like VACUUM on materialized-view refresh operations, see [Usage notes](#mv_REFRESH_MARTERIALIZED_VIEW_usage).

## Examples
<a name="mv_REFRESH_MARTERIALIZED_VIEW_examples"></a>

The following example refreshes the `tickets_mv` materialized view.

```
REFRESH MATERIALIZED VIEW tickets_mv;
```

The following example refreshes the `products_mv` materialized view and all its dependent materialized views:

```
REFRESH MATERIALIZED VIEW products_mv CASCADE; 
```

# RESET
<a name="r_RESET"></a>

Restores the value of a configuration parameter to its default value.

You can reset either a single specified parameter or all parameters at once. To set a parameter to a specific value, use the [SET](r_SET.md) command. To display the current value of a parameter, use the [SHOW](r_SHOW.md) command.

## Syntax
<a name="r_RESET-synopsis"></a>

```
RESET { parameter_name | ALL }
```

The following syntax resets a session context variable, setting its value to NULL.

```
RESET { variable_name | ALL }
```

## Parameters
<a name="r_RESET-parameters"></a>

 *parameter\_name*   
Name of the parameter to reset. See [Modifying the server configuration](cm_chap_ConfigurationRef.md#t_Modifying_the_default_settings) for more documentation about parameters.

ALL   
Resets all runtime parameters, including all the session context variables.

*variable\_name*   
The name of the session context variable to reset. If the value to RESET is a session context variable, Amazon Redshift sets it to NULL.

## Examples
<a name="r_RESET-examples"></a>

The following example resets the `query_group` parameter to its default value: 

```
reset query_group;
```

The following example resets all runtime parameters to their default values. 

```
reset all;
```

The following example resets the `app_context.user_id` session context variable.

```
RESET app_context.user_id;
```

# REVOKE
<a name="r_REVOKE"></a>

Removes access permissions, such as permissions to create, drop, or update tables, from a user or role.

You can only GRANT or REVOKE USAGE permissions on an external schema to database users and roles using the ON SCHEMA syntax. When using ON EXTERNAL SCHEMA with AWS Lake Formation, you can only GRANT and REVOKE permissions to an AWS Identity and Access Management (IAM) role. For the list of permissions, see the syntax.

For stored procedures, USAGE ON LANGUAGE `plpgsql` permissions are granted to PUBLIC by default. EXECUTE ON PROCEDURE permission is granted only to the owner and superusers by default.
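
For example, assuming a stored procedure named `refresh_sales` that takes a single INT argument (both names are illustrative, not part of any built-in catalog), the owner could remove a user's ability to run it as follows. The argument types must be listed so that Amazon Redshift can identify the specific procedure.

```
REVOKE EXECUTE ON PROCEDURE refresh_sales(INT) FROM bob;
```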

Specify in the REVOKE command the permissions that you want to remove. To give permissions, use the [GRANT](r_GRANT.md) command. 

## Syntax
<a name="r_REVOKE-synopsis"></a>

```
REVOKE [ GRANT OPTION FOR ]
{ { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES | ALTER | TRUNCATE } [,...] | ALL [ PRIVILEGES ] }
ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] }
FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
{ { CREATE | TEMPORARY | TEMP | ALTER } [,...] | ALL [ PRIVILEGES ] }
ON DATABASE db_name [, ...]
FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
{ { CREATE | USAGE | ALTER | DROP } [,...] | ALL [ PRIVILEGES ] }
ON SCHEMA schema_name [, ...]
FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
EXECUTE
    ON FUNCTION function_name ( [ [ argname ] argtype [, ...] ] ) [, ...]
    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
{ { EXECUTE } [,...] | ALL [ PRIVILEGES ] }
    ON PROCEDURE procedure_name ( [ [ argname ] argtype [, ...] ] ) [, ...]
    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
USAGE
    ON LANGUAGE language_name [, ...]
    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
[ RESTRICT ]

REVOKE [GRANT OPTION FOR] 
{ { ALTER | DROP} [,...] | ALL [ PRIVILEGES ] }
    ON COPY JOB job_name [,...]
    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]    

REVOKE [GRANT OPTION FOR]
{ { ALTER | DROP | USAGE } [,...] | ALL [ PRIVILEGES ] }
    ON TEMPLATE template_name [,...]
    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```

### Revoking column-level permissions for tables
<a name="revoke-column-level"></a>

The following is the syntax for column-level permissions on Amazon Redshift tables and views. 

```
REVOKE { { SELECT | UPDATE } ( column_name [, ...] ) [, ...] | ALL [ PRIVILEGES ] ( column_name [,...] ) }
     ON { [ TABLE ] table_name [, ...] }
     FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
     [ RESTRICT ]
```
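
For example, the following statement revokes the SELECT permission on two specific columns of a table from a user. The table, column, and user names are illustrative.

```
REVOKE SELECT (salary, ssn) ON TABLE employees FROM bob;
```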

### Revoking ASSUMEROLE permissions
<a name="revoke-assumerole-permissions"></a>

The following is the syntax to revoke the ASSUMEROLE permission from users and groups with a specified role. 

```
REVOKE ASSUMEROLE
    ON { 'iam_role' [, ...]  | default | ALL }
    FROM { user_name | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
    FOR { ALL | COPY | UNLOAD | EXTERNAL FUNCTION | CREATE MODEL }
```
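
For example, the following statement revokes the ASSUMEROLE permission for COPY operations from a user for a specific IAM role. The user name and role ARN are illustrative.

```
REVOKE ASSUMEROLE ON 'arn:aws:iam::123456789012:role/Redshift-S3-Read'
FROM reg_user1 FOR COPY;
```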

### Revoking permissions for Redshift Spectrum for Lake Formation
<a name="revoke-spectrum-integration-with-lf-permissions"></a>

The following is the syntax for Redshift Spectrum integration with Lake Formation.

```
REVOKE [ GRANT OPTION FOR ]
{ SELECT | ALL [ PRIVILEGES ] } ( column_list )
    ON EXTERNAL TABLE schema_name.table_name
    FROM { IAM_ROLE iam_role } [, ...]

REVOKE [ GRANT OPTION FOR ]
{ { SELECT | ALTER | DROP | DELETE | INSERT }  [, ...] | ALL [ PRIVILEGES ] }
    ON EXTERNAL TABLE schema_name.table_name [, ...]
    FROM { { IAM_ROLE iam_role } [, ...] | PUBLIC }

REVOKE [ GRANT OPTION FOR ]
{ { CREATE | ALTER | DROP }  [, ...] | ALL [ PRIVILEGES ] }
    ON EXTERNAL SCHEMA schema_name [, ...]
    FROM { IAM_ROLE iam_role } [, ...]
```

### Revoking datashare permissions
<a name="revoke-datashare-permissions"></a>

**Producer-side datashare permissions**  
The following is the syntax for using REVOKE to remove ALTER or SHARE permissions from a user or role. The user whose permissions have been revoked can no longer alter the datashare, or grant usage to a consumer. 

```
REVOKE { ALTER | SHARE } ON DATASHARE datashare_name
 FROM { username [ WITH GRANT OPTION ] | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```

The following is the syntax for using REVOKE to remove a consumer’s access to a datashare.

```
REVOKE USAGE
 ON DATASHARE datashare_name
 FROM NAMESPACE 'namespaceGUID' [, ...] | ACCOUNT 'accountnumber' [ VIA DATA CATALOG ] [, ...]
```

The following is an example of revoking usage of a datashare from a Lake Formation account.

```
REVOKE USAGE ON DATASHARE salesshare FROM ACCOUNT '123456789012' VIA DATA CATALOG;
```

**Consumer-side datashare permissions**  
The following is the REVOKE syntax for data-sharing usage permissions on a specific database or schema created from a datashare. Revoking usage permission from a database created with the WITH PERMISSIONS clause doesn't revoke any additional permissions you granted a user or role, including object-level permissions granted for underlying objects. If you re-grant usage permission to that user or role, they will retain all additional permissions that they had before you revoked usage.

```
REVOKE USAGE ON { DATABASE shared_database_name [, ...] | SCHEMA shared_schema}
 FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
```
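
For example, the following statement revokes usage of a database created from a datashare from a user. The database and user names are illustrative.

```
REVOKE USAGE ON DATABASE shared_sales_db FROM bob;
```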

### Revoking scoped permissions
<a name="revoke-scoped-permissions"></a>

Scoped permissions let you grant permissions to a user or role on all objects of a type within a database or schema. Users and roles with scoped permissions have the specified permissions on all current and future objects within the database or schema.

You can view the scope of database-level scoped permissions in [SVV\_DATABASE\_PRIVILEGES](r_SVV_DATABASE_PRIVILEGES.md). You can view the scope of schema-level scoped permissions in [SVV\_SCHEMA\_PRIVILEGES](r_SVV_SCHEMA_PRIVILEGES.md).

For more information about scoped permissions, see [Scoped permissions](t_scoped-permissions.md).

The following is the syntax for revoking scoped permissions from users and roles. 

```
REVOKE [ GRANT OPTION ] 
{ { CREATE | USAGE | ALTER | DROP } [,...] | ALL [ PRIVILEGES ] }
FOR SCHEMAS IN
DATABASE db_name 
FROM { username | ROLE role_name } [, ...]

REVOKE [ GRANT OPTION ]
{ { SELECT | INSERT | UPDATE | DELETE | DROP | ALTER | TRUNCATE | REFERENCES } [, ...] | ALL [ PRIVILEGES ] }
FOR TABLES IN
{ SCHEMA schema_name [ DATABASE db_name ] | DATABASE db_name }
FROM { username | ROLE role_name } [, ...]

REVOKE [ GRANT OPTION ] { EXECUTE | ALL [ PRIVILEGES ] }
FOR FUNCTIONS IN 
{ SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
FROM { username | ROLE role_name } [, ...]

REVOKE [ GRANT OPTION ] { EXECUTE | ALL [ PRIVILEGES ] }
FOR PROCEDURES IN
{ SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
FROM { username | ROLE role_name } [, ...]

REVOKE [ GRANT OPTION ] USAGE
FOR LANGUAGES IN
DATABASE db_name
FROM { username | ROLE role_name } [, ...]  

REVOKE [ GRANT OPTION ] 
{ { CREATE | ALTER | DROP} [,...] | ALL [ PRIVILEGES ] }
FOR COPY JOBS 
IN DATABASE db_name
FROM { username [ WITH GRANT OPTION ] | ROLE role_name } [, ...]      

REVOKE [ GRANT OPTION ]
{ {ALTER | DROP  | USAGE } [,...] | ALL [ PRIVILEGES ] }
FOR TEMPLATES IN
{ SCHEMA schema_name [DATABASE db_name ] | DATABASE db_name }
FROM { username | ROLE role_name } [, ...]
```

Note that scoped permissions don’t distinguish between permissions for functions and for procedures. For example, the following statement revokes `EXECUTE` permissions for both functions and procedures from `bob` in the schema `Sales_schema`. 

```
REVOKE EXECUTE FOR FUNCTIONS IN SCHEMA Sales_schema FROM bob;
```

### Revoking machine learning permissions
<a name="revoke-model-permissions"></a>

The following is the syntax for machine learning model permissions on Amazon Redshift.

```
REVOKE [ GRANT OPTION FOR ]
    CREATE MODEL FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
    [ RESTRICT ]

REVOKE [ GRANT OPTION FOR ]
    { EXECUTE | ALL [ PRIVILEGES ] }
    ON MODEL model_name [, ...]

    FROM { username | ROLE role_name | GROUP group_name | PUBLIC } [, ...]
    [ RESTRICT ]
```

### Revoking role permissions
<a name="revoke-roles"></a>

The following is the syntax for revoking role permissions on Amazon Redshift.

```
REVOKE [ ADMIN OPTION FOR ] { ROLE role_name } [, ...] FROM { user_name } [, ...]
```

```
REVOKE { ROLE role_name } [, ...] FROM { ROLE role_name } [, ...]
```
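
For example, the following statements revoke a role from a user and a nested role from another role. The role and user names are illustrative.

```
REVOKE ROLE sample_role1 FROM user1;
REVOKE ROLE sample_role2 FROM ROLE sample_role1;
```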

The following is the syntax for revoking system permissions to roles on Amazon Redshift.

```
REVOKE
  {
    { CREATE USER | DROP USER | ALTER USER |
    CREATE SCHEMA | DROP SCHEMA |
    ALTER DEFAULT PRIVILEGES |
    ACCESS CATALOG |
    CREATE TABLE | DROP TABLE | ALTER TABLE |
    CREATE OR REPLACE FUNCTION | CREATE OR REPLACE EXTERNAL FUNCTION |
    DROP FUNCTION |
    CREATE OR REPLACE PROCEDURE | DROP PROCEDURE |
    CREATE OR REPLACE VIEW | DROP VIEW |
    CREATE MODEL | DROP MODEL |
    CREATE DATASHARE | ALTER DATASHARE | DROP DATASHARE |
    CREATE LIBRARY | DROP LIBRARY |
    CREATE ROLE | DROP ROLE |
    TRUNCATE TABLE |
    VACUUM | ANALYZE | CANCEL }[, ...]
  }
  | { ALL [ PRIVILEGES ] }
FROM { ROLE role_name } [, ...]
```

### Revoking permissions for security policies
<a name="revoke-role-level"></a>

The following is the syntax for revoking permissions to explain the security policy filters of a query in the EXPLAIN plan. Possible security policies include row-level security policies and dynamic data masking policies.

```
REVOKE EXPLAIN { RLS | MASKING } FROM ROLE rolename 
```

The following is the syntax for revoking permissions to bypass row-level security policies for a query. 

```
REVOKE IGNORE RLS FROM ROLE rolename 
```

The following is the syntax for revoking SELECT permissions from the specified security policy. Possible security policies include row-level security policies and dynamic data masking policies.

```
REVOKE SELECT ON [ TABLE ] table_name [, ...]
            FROM { RLS | MASKING } POLICY policy_name [, ...]
```
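
For example, the following statement revokes the SELECT permission on a lookup table from a row-level security policy, so that the policy definition can no longer read that table. The table and policy names are illustrative.

```
REVOKE SELECT ON TABLE tickit_category FROM RLS POLICY policy_concerts;
```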

## Parameters
<a name="r_REVOKE-parameters"></a>

GRANT OPTION FOR   
Revokes only the option to grant a specified permission to other users and doesn't revoke the permission itself. You can't revoke GRANT OPTION from a group or from PUBLIC.

SELECT   
Revokes the permission to select data from a table or a view using a SELECT statement.

INSERT   
Revokes the permission to load data into a table using an INSERT statement or a COPY statement. 

UPDATE   
Revokes the permission to update a table column using an UPDATE statement. 

DELETE   
Revokes the permission to delete a data row from a table.

REFERENCES   
Revokes the permission to create a foreign key constraint. You should revoke this permission on both the referenced table and the referencing table.

TRUNCATE  
Revokes the permission to truncate a table. Without this permission, only the owner of a table or a superuser can truncate a table. For more information about the TRUNCATE command, see [TRUNCATE](r_TRUNCATE.md).

ALL [ PRIVILEGES ]   
Revokes all available permissions at once from the specified user or group. The PRIVILEGES keyword is optional.  
 Amazon Redshift doesn't support the RULE and TRIGGER permissions. For more information, go to [Unsupported PostgreSQL features](c_unsupported-postgresql-features.md). 

ALTER  
Depending on the database object, revokes the following permissions from the user or user group:   
+ For tables, ALTER revokes permission to alter a table or view. For more information, see [ALTER TABLE](r_ALTER_TABLE.md).
+ For databases, ALTER revokes permission to alter a database. For more information, see [ALTER DATABASE](r_ALTER_DATABASE.md).
+ For schemas, ALTER revokes permission to alter a schema. For more information, see [ALTER SCHEMA](r_ALTER_SCHEMA.md).
+ For external tables, ALTER revokes permission to alter a table in an AWS Glue Data Catalog that is enabled for Lake Formation. This permission only applies when using Lake Formation.

DROP  
Depending on the database object, revokes the following permissions from the user or role:  
+  For tables, DROP revokes permission to drop a table or view. For more information, see [DROP TABLE](r_DROP_TABLE.md). 
+  For databases, DROP revokes permission to drop a database. For more information, see [DROP DATABASE](r_DROP_DATABASE.md). 
+  For schemas, DROP revokes permission to drop a schema. For more information, see [DROP SCHEMA](r_DROP_SCHEMA.md). 

ASSUMEROLE  <a name="assumerole"></a>
Revokes the permission to run COPY, UNLOAD, EXTERNAL FUNCTION, or CREATE MODEL commands from users, roles, or groups with a specified role. 

ON [ TABLE ] *table\_name*   
Revokes the specified permissions on a table or a view. The TABLE keyword is optional.

ON ALL TABLES IN SCHEMA *schema\_name*   
Revokes the specified permissions on all tables in the referenced schema.

( *column\_name* [,...] ) ON TABLE *table\_name*   <a name="revoke-column-level-privileges"></a>
Revokes the specified permissions from users, groups, or PUBLIC on the specified columns of the Amazon Redshift table or view.

( *column\_list* ) ON EXTERNAL TABLE *schema\_name.table\_name*   <a name="revoke-external-table-column"></a>
Revokes the specified permissions from an IAM role on the specified columns of the Lake Formation table in the referenced schema.

ON EXTERNAL TABLE *schema\_name.table\_name*   <a name="revoke-external-table"></a>
Revokes the specified permissions from an IAM role on the specified Lake Formation tables in the referenced schema.

ON EXTERNAL SCHEMA *schema\_name*   <a name="revoke-external-schema"></a>
Revokes the specified permissions from an IAM role on the referenced schema.

FROM IAM\_ROLE *iam\_role*   <a name="revoke-from-iam-role"></a>
Indicates the IAM role losing the permissions.

ROLE *role\_name*   
Revokes the permissions from the specified role.

GROUP *group\_name*   
Revokes the permissions from the specified user group.

PUBLIC   
Revokes the specified permissions from all users. PUBLIC represents a group that always includes all users. An individual user's permissions consist of the sum of permissions granted to PUBLIC, permissions granted to any groups that the user belongs to, and any permissions granted to the user individually.  
Revoking PUBLIC from a Lake Formation external table results in revoking the permission from the Lake Formation *everyone* group.

CREATE   
Depending on the database object, revokes the following permissions from the user or group:  
+ For databases, using the CREATE clause for REVOKE prevents users from creating schemas within the database.
+ For schemas, using the CREATE clause for REVOKE prevents users from creating objects within a schema. To rename an object, the user must have the CREATE permission and own the object to be renamed. 
By default, all users have CREATE and USAGE permissions on the PUBLIC schema.

TEMPORARY | TEMP   
Revokes the permission to create temporary tables in the specified database.  
By default, users are granted permission to create temporary tables by their automatic membership in the PUBLIC group. To remove the permission for any users to create temporary tables, revoke the TEMP permission from the PUBLIC group and then explicitly grant the permission to create temporary tables to specific users or groups of users.
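The two-step process described above can be sketched as follows; the `tickit` database appears elsewhere in this reference, and the group name `temp_users` is a placeholder:

```
revoke temporary on database tickit from public;
grant temporary on database tickit to group temp_users;
```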

ON DATABASE *db\_name*   
Revokes the permissions on the specified database.

USAGE   
Revokes USAGE permissions on objects within a specific schema, which makes these objects inaccessible to users. Specific actions on these objects must be revoked separately (such as the EXECUTE permission on functions).  
By default, all users have CREATE and USAGE permissions on the PUBLIC schema.

ON SCHEMA *schema\_name*   
Revokes the permissions on the specified schema. You can use schema permissions to control the creation of tables; the CREATE permission for a database only controls the creation of schemas.

RESTRICT   
Revokes only those permissions that the user directly granted. This behavior is the default.

EXECUTE ON PROCEDURE *procedure\_name*   
Revokes the EXECUTE permission on a specific stored procedure. Because stored procedure names can be overloaded, you must include the argument list for the procedure. For more information, see [Naming stored procedures](stored-procedure-naming.md).
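Because the argument list is required, a revoke statement names the procedure's signature. A sketch, using a hypothetical procedure `repair_db` that takes a VARCHAR and an INTEGER argument:

```
revoke execute on procedure repair_db(varchar, integer) from user1;
```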

EXECUTE ON ALL PROCEDURES IN SCHEMA *schema\_name*   
Revokes the specified permissions on all procedures in the referenced schema.

USAGE ON LANGUAGE *language\_name*   
Revokes the USAGE permission on a language. For Python user-defined functions (UDFs), use `plpythonu`. For SQL UDFs, use `sql`. For stored procedures, use `plpgsql`.   
To create a UDF, you must have permission for usage on language for SQL or `plpythonu` (Python). By default, USAGE ON LANGUAGE SQL is granted to PUBLIC. However, you must explicitly grant USAGE ON LANGUAGE PLPYTHONU to specific users or groups.   
To revoke usage for SQL, first revoke usage from PUBLIC. Then grant usage on SQL only to the specific users or groups permitted to create SQL UDFs. The following example revokes usage on SQL from PUBLIC then grants usage to the user group `udf_devs`.   

```
revoke usage on language sql from PUBLIC;
grant usage on language sql to group udf_devs;
```
For more information, see [UDF security and permissions](udf-security-and-privileges.md).   
To revoke usage for stored procedures, first revoke usage from PUBLIC. Then grant usage on `plpgsql` only to the specific users or groups permitted to create stored procedures. For more information, see [Security and privileges for stored procedures](stored-procedure-security-and-privileges.md). 
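The pattern mirrors the SQL UDF example above; here `proc_devs` is a placeholder group name for the users permitted to create stored procedures:

```
revoke usage on language plpgsql from PUBLIC;
grant usage on language plpgsql to group proc_devs;
```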

ON COPY JOB *job\_name*  <a name="on-copy-job-revoke"></a>
Revokes the specified permissions on a copy job.

FOR { ALL | COPY | UNLOAD | EXTERNAL FUNCTION | CREATE MODEL } [, ...]  <a name="revoke-for"></a>
Specifies the SQL commands for which the permission is revoked. You can specify ALL to revoke the permission on the COPY, UNLOAD, EXTERNAL FUNCTION, and CREATE MODEL statements. This clause applies only to revoking the ASSUMEROLE permission.

ALTER  
Revokes the ALTER permission for users or user groups that allows those that don't own a datashare to alter the datashare. This permission is required to add or remove objects from a datashare, or to set the property PUBLICACCESSIBLE. For more information, see [ALTER DATASHARE](r_ALTER_DATASHARE.md).

SHARE  
Revokes permissions for users and user groups to add consumers to a datashare. Revoking this permission is required to stop the particular consumer from accessing the datashare from its clusters. 

ON DATASHARE *datashare\_name*  
Revokes the specified permissions on the referenced datashare.

FROM username  
Indicates the user losing the permissions.

FROM GROUP *group\_name*  
Indicates the user group losing the permissions.

WITH GRANT OPTION  
Indicates that the user losing the permissions could in turn grant the same permissions to others. You can't revoke WITH GRANT OPTION for a group or for PUBLIC. 

USAGE  
Revokes access to a datashare from consumers. When USAGE is revoked from a consumer account, or from a namespace within the same account, that consumer can no longer access the datashare or read its objects. 

FROM NAMESPACE 'clusternamespace GUID'   
Indicates the namespace in the same account that has consumers losing the permissions to the datashare. Namespaces use a 128-bit alphanumeric globally unique identifier (GUID).

FROM ACCOUNT 'accountnumber' [ VIA DATA CATALOG ]  
Indicates the account number of another account that has the consumers losing the permissions to the datashare. Specifying ‘VIA DATA CATALOG’ indicates that you are revoking usage of the datashare from a Lake Formation account. Omitting the account number means that you're revoking from the account that owns the cluster.
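The account-level form described above can be sketched as follows; the datashare name `salesshare` appears elsewhere in this reference, and the account number is a placeholder. The second statement shows the Lake Formation variant:

```
revoke usage on datashare salesshare from account '123456789012';
revoke usage on datashare salesshare from account '123456789012' via data catalog;
```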

ON DATABASE *shared\_database\_name* [, ...]   <a name="revoke-datashare"></a>
Revokes the specified usage permissions on the specified database that was created in the specified datashare. 

ON SCHEMA *shared\_schema*   <a name="revoke-datashare"></a>
Revokes the specified permissions on the specified schema that was created in the specified datashare.

FOR { SCHEMAS | TABLES | FUNCTIONS | PROCEDURES | LANGUAGES | COPY JOBS } IN   
Specifies the database objects to revoke permission from. The parameters following IN define the scope of the revoked permission.

CREATE MODEL  
Revokes the CREATE MODEL permission to create machine learning models in the specified database.

ON MODEL *model\_name*  
Revokes the EXECUTE permission for a specific model. 

ACCESS CATALOG  
Revokes the permission to view relevant metadata of objects that the role has access to.

[ ADMIN OPTION FOR ] *role* [, ...]  
The role that you revoke from a specified user that has the WITH ADMIN OPTION.

FROM *role* [, ...]  
The role that you revoke the specified role from.

EXPLAIN { RLS | MASKING } FROM ROLE *rolename*  
Revokes the permission to explain the security policy filters of a query in the EXPLAIN plan from a role. RLS revokes permission to explain row-level security policy filters. MASKING revokes permission to explain dynamic data masking policy filters.

IGNORE RLS FROM ROLE *rolename*   
Revokes the permission to bypass row-level security policies for a query from a role.

FROM { RLS | MASKING } POLICY *policy\_name*  
Indicates the security policy losing the permissions. RLS POLICY indicates a row-level security policy. MASKING POLICY indicates a dynamic data masking policy.

## Usage notes
<a name="r_REVOKE-usage-notes-link"></a>

To learn more about the usage notes for REVOKE, see [Usage notes](r_REVOKE-usage-notes.md).

## Examples
<a name="r_REVOKE-examples-link"></a>

For examples of how to use REVOKE, see [Examples](r_REVOKE-examples.md).

# Usage notes
<a name="r_REVOKE-usage-notes"></a>

To revoke privileges from an object, you must meet one of the following criteria:
+ Be the object owner.
+ Be a superuser.
+ Have a grant privilege for that object and privilege.

  For example, the following command enables the user HR both to perform SELECT commands on the employees table and to grant and revoke the same privilege for other users.

  ```
  grant select on table employees to HR with grant option;
  ```

  HR can't revoke privileges for any operation other than SELECT, or on any other table than employees. 

Superusers can access all objects regardless of GRANT and REVOKE commands that set object privileges.

PUBLIC represents a group that always includes all users. By default all members of PUBLIC have CREATE and USAGE privileges on the PUBLIC schema. To restrict any user's permissions on the PUBLIC schema, you must first revoke all permissions from PUBLIC on the PUBLIC schema, then grant privileges to specific users or groups. The following example controls table creation privileges in the PUBLIC schema.

```
revoke create on schema public from public;
```

To revoke privileges from a Lake Formation table, the IAM role associated with the table's external schema must have permission to revoke privileges to the external table. The following example creates an external schema with an associated IAM role `myGrantor`. IAM role `myGrantor` has the permission to revoke permissions from others. The REVOKE command uses the permission of the IAM role `myGrantor` that is associated with the external schema to revoke permission to the IAM role `myGrantee`.

```
create external schema mySchema
from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::123456789012:role/myGrantor'
create external database if not exists;
```

```
revoke select
on external table mySchema.mytable
from iam_role 'arn:aws:iam::123456789012:role/myGrantee';
```

**Note**  
If the IAM role also has the `ALL` permission in an AWS Glue Data Catalog that is enabled for Lake Formation, the `ALL` permission isn't revoked. Only the `SELECT` permission is revoked. You can view the Lake Formation permissions in the Lake Formation console.

## Usage notes for revoking the ASSUMEROLE permission
<a name="r_REVOKE-usage-notes-assumerole"></a>

The following usage notes apply to revoking the ASSUMEROLE privilege in Amazon Redshift. 

Only a database superuser can revoke the ASSUMEROLE privilege for users and groups. A superuser always retains the ASSUMEROLE privilege. 

Before granting the ASSUMEROLE privilege to users and groups, a superuser must run the following statement once on the cluster. 

```
revoke assumerole on all from public for all;
```

## Usage notes for revoking machine learning permissions
<a name="r_REVOKE-usage-notes-create-model"></a>

You can't directly grant or revoke permissions on an ML function. An ML function belongs to an ML model, and permissions are controlled through the model. Instead, revoke permissions on the ML model. The following example demonstrates how to revoke the EXECUTE permission from all users associated with the model `customer_churn`.

```
REVOKE EXECUTE ON MODEL customer_churn FROM PUBLIC;
```

You can also revoke all permissions from a user for the ML model `customer_churn`.

```
REVOKE ALL on MODEL customer_churn FROM ml_user;
```

Granting or revoking the `EXECUTE` permission for all functions in a schema fails if the schema contains an ML function, even if that ML function already has the `EXECUTE` permission through `GRANT EXECUTE ON MODEL`. We recommend using a separate schema with the `CREATE MODEL` command to keep the ML functions in a schema by themselves. The following example demonstrates how to do so.

```
CREATE MODEL ml_schema.customer_churn
FROM customer_data
TARGET churn
FUNCTION ml_schema.customer_churn_prediction
IAM_ROLE default
SETTINGS (
 S3_BUCKET 'amzn-s3-demo-bucket'
);
```

# Examples
<a name="r_REVOKE-examples"></a>

The following example revokes INSERT privileges on the SALES table from the GUESTS user group. This command prevents members of GUESTS from being able to load data into the SALES table by using the INSERT command. 

```
revoke insert on table sales from group guests;
```

The following example revokes the SELECT privilege on all tables in the QA\_TICKIT schema from the user `fred`.

```
revoke select on all tables in schema qa_tickit from fred;
```

The following example revokes the privilege to select from a view for user `bobr`.

```
revoke select on table eventview from bobr;
```

The following example revokes the privilege to create temporary tables in the TICKIT database from all users.

```
revoke temporary on database tickit from public;
```

The following example revokes SELECT privilege on the `cust_name` and `cust_phone` columns of the `cust_profile` table from the user `user1`. 

```
revoke select(cust_name, cust_phone) on cust_profile from user1;
```

The following example revokes SELECT privilege on the `cust_name` and `cust_phone` columns and UPDATE privilege on the `cust_contact_preference` column of the `cust_profile` table from the `sales_group` group. 

```
revoke select(cust_name, cust_phone), update(cust_contact_preference) on cust_profile from group sales_group;
```

The following example shows the usage of the ALL keyword to revoke both SELECT and UPDATE privileges on three columns of the table `cust_profile` from the `sales_admin` group. 

```
revoke ALL(cust_name, cust_phone,cust_contact_preference) on cust_profile from group sales_admin;
```

The following example revokes the SELECT privilege on the `cust_name` column of the `cust_profile_vw` view from the `user2` user. 

```
revoke select(cust_name) on cust_profile_vw from user2;
```

## Examples of revoking the USAGE permission from databases created from datashares
<a name="r_REVOKE-examples-datashare"></a>

The following example revokes access to the `salesshare` datashare from the `13b8833d-17c6-4f16-8fe4-1a018f5ed00d` namespace.

```
REVOKE USAGE ON DATASHARE salesshare FROM NAMESPACE '13b8833d-17c6-4f16-8fe4-1a018f5ed00d';
```

The following example revokes the USAGE permission on the `sales_db` from `Bob`.

```
REVOKE USAGE ON DATABASE sales_db FROM Bob;
```

The following example revokes the USAGE permission on the `sales_schema` schema from the `Analyst_role` role.

```
REVOKE USAGE ON SCHEMA sales_schema FROM ROLE Analyst_role;
```

## Examples of revoking scoped permissions
<a name="r_REVOKE-examples-scoped"></a>

The following example revokes usage for all current and future schemas in the `Sales_db` database from the `Sales` role.

```
REVOKE USAGE FOR SCHEMAS IN DATABASE Sales_db FROM ROLE Sales;
```

The following example revokes the ability to grant the SELECT permission for all current and future tables in the `Sales_db` database from the user `alice`. `alice` retains access to all tables in `Sales_db`.

```
REVOKE GRANT OPTION SELECT FOR TABLES IN DATABASE Sales_db FROM alice;
```

The following example revokes the EXECUTE permission for functions in the `Sales_schema` schema from the user `bob`.

```
REVOKE EXECUTE FOR FUNCTIONS IN SCHEMA Sales_schema FROM bob;
```

The following example revokes all permissions for all tables in the `ShareDb` database’s `ShareSchema` schema from the `Sales` role. When specifying the schema, you can also specify the schema’s database using the two-part format `database.schema`.

```
REVOKE ALL FOR TABLES IN SCHEMA ShareDb.ShareSchema FROM ROLE Sales;
```

The following example is the same as the preceding one. You can specify the schema’s database using the `DATABASE` keyword instead of using a two-part format.

```
REVOKE ALL FOR TABLES IN SCHEMA ShareSchema DATABASE ShareDb FROM ROLE Sales;
```

## Examples of revoking the ASSUMEROLE privilege
<a name="r_REVOKE-examples-assumerole"></a>

The following are examples of revoking the ASSUMEROLE privilege. 

A superuser must enable the use of the ASSUMEROLE privilege for users and groups by running the following statement once on the cluster: 

```
revoke assumerole on all from public for all;
```

The following statement revokes the ASSUMEROLE privilege from user reg\_user1 on all roles for all operations. 

```
revoke assumerole on all from reg_user1 for all;
```

## Examples of revoking the ROLE privilege
<a name="r_REVOKE-examples-role"></a>

The following example revokes sample\_role1 from sample\_role2.

```
CREATE ROLE sample_role2;
GRANT ROLE sample_role1 TO ROLE sample_role2;
REVOKE ROLE sample_role1 FROM ROLE sample_role2;
```

The following example revokes system privileges from user1.

```
GRANT ROLE sys:DBA TO user1;
REVOKE ROLE sys:DBA FROM user1;
```

The following example revokes sample\_role1 and sample\_role2 from user1.

```
CREATE ROLE sample_role1;
CREATE ROLE sample_role2;
GRANT ROLE sample_role1, ROLE sample_role2 TO user1;
REVOKE ROLE sample_role1, ROLE sample_role2 FROM user1;
```

The following example revokes sample\_role2 with the ADMIN OPTION from user1.

```
GRANT ROLE sample_role2 TO user1 WITH ADMIN OPTION;
REVOKE ADMIN OPTION FOR ROLE sample_role2 FROM user1;
REVOKE ROLE sample_role2 FROM user1;
```

The following example revokes sample\_role1 and sample\_role2 from sample\_role5.

```
CREATE ROLE sample_role5;
GRANT ROLE sample_role1, ROLE sample_role2 TO ROLE sample_role5;
REVOKE ROLE sample_role1, ROLE sample_role2 FROM ROLE sample_role5;
```

The following example revokes the CREATE SCHEMA and DROP SCHEMA system privileges from sample\_role1.

```
GRANT CREATE SCHEMA, DROP SCHEMA TO ROLE sample_role1;
REVOKE CREATE SCHEMA, DROP SCHEMA FROM ROLE sample_role1;
```

# ROLLBACK
<a name="r_ROLLBACK"></a>

Stops the current transaction and discards all updates made by that transaction.

This command performs the same function as the [ABORT](r_ABORT.md) command.

## Syntax
<a name="r_ROLLBACK-synopsis"></a>

```
ROLLBACK [ WORK | TRANSACTION ]
```

## Parameters
<a name="r_ROLLBACK-parameters"></a>

WORK  
Optional keyword. This keyword isn't supported within a stored procedure. 

TRANSACTION  
Optional keyword. WORK and TRANSACTION are synonyms. Neither is supported within a stored procedure. 

For information about using ROLLBACK within a stored procedure, see [Managing transactions](stored-procedure-transaction-management.md). 

## Example
<a name="r_ROLLBACK-example"></a>

The following example creates a table then starts a transaction where data is inserted into the table. The ROLLBACK command then rolls back the data insertion to leave the table empty.

The following command creates an example table called MOVIE\_GROSS:

```
create table movie_gross( name varchar(30), gross bigint );
```

The next set of commands starts a transaction that inserts two data rows into the table:

```
begin;

insert into movie_gross values ( 'Raiders of the Lost Ark', 23400000);

insert into movie_gross values ( 'Star Wars', 10000000 );
```

Next, the following command selects the data from the table to show that it was successfully inserted:

```
select * from movie_gross;
```

The command output shows that both rows were successfully inserted:

```
name           |  gross
-------------------------+----------
Raiders of the Lost Ark | 23400000
Star Wars               | 10000000
(2 rows)
```

This command now rolls back the data changes to where the transaction began:

```
rollback;
```

Selecting data from the table now shows an empty table:

```
select * from movie_gross;

name | gross
------+-------
(0 rows)
```

# SELECT
<a name="r_SELECT_synopsis"></a>

Returns rows from tables, views, and user-defined functions. 

**Note**  
The maximum size for a single SQL statement is 16 MB.

## Syntax
<a name="r_SELECT_synopsis-synopsis"></a>

```
[ WITH with_subquery [, ...] ]
SELECT
[ TOP number | [ ALL | DISTINCT ]
* | expression [ AS output_name ] [, ...] ]
[ EXCLUDE column_list ]
[ FROM table_reference [, ...] ]
[ WHERE condition ]
[ [ START WITH expression ] CONNECT BY expression ]
[ GROUP BY ALL | expression [, ...] ]
[ HAVING condition ]
[ QUALIFY condition ]
[ { UNION | INTERSECT | { EXCEPT | MINUS } } [ ALL ] query ]
[ ORDER BY expression [ ASC | DESC ] ]
[ LIMIT { number | ALL } ]
[ OFFSET start ]
```

**Topics**
+ [Syntax](#r_SELECT_synopsis-synopsis)
+ [WITH clause](r_WITH_clause.md)
+ [SELECT list](r_SELECT_list.md)
+ [EXCLUDE column\_list](r_EXCLUDE_list.md)
+ [FROM clause](r_FROM_clause30.md)
+ [WHERE clause](r_WHERE_clause.md)
+ [GROUP BY clause](r_GROUP_BY_clause.md)
+ [HAVING clause](r_HAVING_clause.md)
+ [QUALIFY clause](r_QUALIFY_clause.md)
+ [UNION, INTERSECT, and EXCEPT](r_UNION.md)
+ [ORDER BY clause](r_ORDER_BY_clause.md)
+ [CONNECT BY clause](r_CONNECT_BY_clause.md)
+ [Subquery examples](r_Subquery_examples.md)
+ [Correlated subqueries](r_correlated_subqueries.md)

# WITH clause
<a name="r_WITH_clause"></a>

A WITH clause is an optional clause that precedes the SELECT list in a query. The WITH clause defines one or more *common\_table\_expressions*. Each common table expression (CTE) defines a temporary table, which is similar to a view definition. You can reference these temporary tables in the FROM clause. They're used only while the query they belong to runs. Each CTE in the WITH clause specifies a table name, an optional list of column names, and a query expression that evaluates to a table (a SELECT statement). When you reference the temporary table name in the FROM clause of the same query expression that defines it, the CTE is recursive. 

WITH clause subqueries are an efficient way of defining tables that can be used throughout the execution of a single query. In all cases, the same results can be achieved by using subqueries in the main body of the SELECT statement, but WITH clause subqueries may be simpler to write and read. Where possible, WITH clause subqueries that are referenced multiple times are optimized as common subexpressions; that is, it may be possible to evaluate a WITH subquery once and reuse its results. (Note that common subexpressions aren't limited to those defined in the WITH clause.)

## Syntax
<a name="r_WITH_clause-synopsis"></a>

```
[ WITH [RECURSIVE] common_table_expression [, common_table_expression , ...] ]
```

where *common\_table\_expression* can be either non-recursive or recursive. Following is the non-recursive form: 

```
CTE_table_name [ ( column_name [, ...] ) ] AS ( query )
```

Following is the recursive form of *common\_table\_expression*:

```
CTE_table_name (column_name [, ...] ) AS ( recursive_query )
```

## Parameters
<a name="r_WITH_clause-parameters"></a>

 RECURSIVE   
Keyword that identifies the query as a recursive CTE. This keyword is required if any *common\_table\_expression* defined in the WITH clause is recursive. You can only specify the RECURSIVE keyword once, immediately following the WITH keyword, even when the WITH clause contains multiple recursive CTEs. In general, a recursive CTE is a UNION ALL subquery with two parts. 

 *common\_table\_expression*   
Defines a temporary table that you can reference in the [FROM clause](r_FROM_clause30.md) and is used only during the execution of the query to which it belongs. 

 *CTE\_table\_name*   
A unique name for a temporary table that defines the results of a WITH clause subquery. You can't use duplicate names within a single WITH clause. Each subquery must be given a table name that can be referenced in the [FROM clause](r_FROM_clause30.md).

 *column\_name*   
 A list of output column names for the WITH clause subquery, separated by commas. The number of column names specified must be equal to or less than the number of columns defined by the subquery. For a CTE that is non-recursive, the *column\_name* clause is optional. For a recursive CTE, the *column\_name* list is required.

 *query*   
 Any SELECT query that Amazon Redshift supports. See [SELECT](r_SELECT_synopsis.md). 

 *recursive\_query*   
A UNION ALL query that consists of two SELECT subqueries:  
+ The first SELECT subquery doesn't have a recursive reference to the same *CTE\_table\_name*. It returns a result set that is the initial seed of the recursion. This part is called the initial member or seed member.
+ The second SELECT subquery references the same *CTE\_table\_name* in its FROM clause. This is called the recursive member. The *recursive\_query* contains a WHERE condition to end the *recursive\_query*. 
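The two parts can be sketched as follows, assuming a hypothetical `employee` table with `id`, `name`, and `manager_id` columns. The seed member selects the root of the hierarchy, and the recursive member joins back to the CTE until the WHERE condition ends the recursion:

```
with recursive org_chart (id, name, level) as (
-- seed member: start at the employee with no manager
select id, name, 1 from employee where manager_id is null
union all
-- recursive member: references org_chart; the level check ends the recursion
select e.id, e.name, o.level + 1
from employee e
join org_chart o on e.manager_id = o.id
where o.level < 10
)
select * from org_chart;
```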

## Usage notes
<a name="r_WITH_clause-usage-notes"></a>

You can use a WITH clause in the following SQL statements: 
+ SELECT 
+ SELECT INTO
+ CREATE TABLE AS
+ CREATE VIEW
+ DECLARE
+ EXPLAIN
+ INSERT INTO...SELECT 
+ PREPARE
+ UPDATE (within a WHERE clause subquery. You can't define a recursive CTE in the subquery. The recursive CTE must precede the UPDATE clause.)
+ DELETE

If the FROM clause of a query that contains a WITH clause doesn't reference any of the tables defined by the WITH clause, the WITH clause is ignored and the query runs as normal.

A table defined by a WITH clause subquery can be referenced only in the scope of the SELECT query that the WITH clause begins. For example, you can reference such a table in the FROM clause of a subquery in the SELECT list, WHERE clause, or HAVING clause. You can't use a WITH clause in a subquery and reference its table in the FROM clause of the main query or another subquery. This query pattern results in an error message of the form `relation table_name doesn't exist` for the WITH clause table.

You can't specify another WITH clause inside a WITH clause subquery.

You can't make forward references to tables defined by WITH clause subqueries. For example, the following query returns an error because of the forward reference to table W2 in the definition of table W1: 

```
with w1 as (select * from w2), w2 as (select * from w1)
select * from sales;
ERROR:  relation "w2" does not exist
```

A WITH clause subquery may not consist of a SELECT INTO statement; however, you can use a WITH clause in a SELECT INTO statement.

## Recursive common table expressions
<a name="r_WITH_clause-recursive-cte"></a>

A recursive *common table expression (CTE)* is a CTE that references itself. A recursive CTE is useful in querying hierarchical data, such as organization charts that show reporting relationships between employees and managers. See [Example: Recursive CTE](#r_WITH_clause-recursive-cte-example).

Another common use is a multilevel bill of materials, when a product consists of many components and each component itself also consists of other components or subassemblies.

Be sure to limit the depth of recursion by including a WHERE clause in the second SELECT subquery of the recursive query. For an example, see [Example: Recursive CTE](#r_WITH_clause-recursive-cte-example). Otherwise, an error can occur similar to the following:
+ `Recursive CTE out of working buffers.`
+ `Exceeded recursive CTE max rows limit, please add correct CTE termination predicates or change the max_recursion_rows parameter.`

**Note**  
`max_recursion_rows` is a parameter that sets the maximum number of rows a recursive CTE can return, in order to prevent infinite recursion loops. We recommend keeping the default value; raising it lets queries with runaway recursion consume excessive space in your cluster.

 You can specify a sort order and limit on the result of the recursive CTE. You can include group by and distinct options on the final result of the recursive CTE.

You can't specify a WITH RECURSIVE clause inside a subquery. The *recursive\$1query* member can't include an order by or limit clause. 

## Examples
<a name="r_WITH_clause-examples"></a>

The following example shows the simplest possible case of a query that contains a WITH clause. The WITH query named VENUECOPY selects all of the rows from the VENUE table. The main query in turn selects all of the rows from VENUECOPY. The VENUECOPY table exists only for the duration of this query. 

```
with venuecopy as (select * from venue)
select * from venuecopy order by 1 limit 10;
```

```
 venueid |         venuename          |    venuecity    | venuestate | venueseats
---------+----------------------------+-----------------+------------+------------
       1 | Toyota Park                | Bridgeview      | IL         |          0
       2 | Columbus Crew Stadium      | Columbus        | OH         |          0
       3 | RFK Stadium                | Washington      | DC         |          0
       4 | CommunityAmerica Ballpark  | Kansas City     | KS         |          0
       5 | Gillette Stadium           | Foxborough      | MA         |      68756
       6 | New York Giants Stadium    | East Rutherford | NJ         |      80242
       7 | BMO Field                  | Toronto         | ON         |          0
       8 | The Home Depot Center      | Carson          | CA         |          0
       9 | Dick's Sporting Goods Park | Commerce City   | CO         |          0
      10 | Pizza Hut Park             | Frisco          | TX         |          0
(10 rows)
```

The following example shows a WITH clause that produces two tables, named VENUE\_SALES and TOP\_VENUES. The second WITH query table selects from the first. In turn, the WHERE clause of the main query block contains a subquery that constrains the TOP\_VENUES table. 

```
with venue_sales as
(select venuename, venuecity, sum(pricepaid) as venuename_sales
from sales, venue, event
where venue.venueid=event.venueid and event.eventid=sales.eventid
group by venuename, venuecity),

top_venues as
(select venuename
from venue_sales
where venuename_sales > 800000)

select venuename, venuecity, venuestate,
sum(qtysold) as venue_qty,
sum(pricepaid) as venue_sales
from sales, venue, event
where venue.venueid=event.venueid and event.eventid=sales.eventid
and venuename in(select venuename from top_venues)
group by venuename, venuecity, venuestate
order by venuename;
```

```
        venuename       |   venuecity   | venuestate | venue_qty | venue_sales
------------------------+---------------+------------+-----------+-------------
August Wilson Theatre   | New York City | NY         |      3187 |  1032156.00
Biltmore Theatre        | New York City | NY         |      2629 |   828981.00
Charles Playhouse       | Boston        | MA         |      2502 |   857031.00
Ethel Barrymore Theatre | New York City | NY         |      2828 |   891172.00
Eugene O'Neill Theatre  | New York City | NY         |      2488 |   828950.00
Greek Theatre           | Los Angeles   | CA         |      2445 |   838918.00
Helen Hayes Theatre     | New York City | NY         |      2948 |   978765.00
Hilton Theatre          | New York City | NY         |      2999 |   885686.00
Imperial Theatre        | New York City | NY         |      2702 |   877993.00
Lunt-Fontanne Theatre   | New York City | NY         |      3326 |  1115182.00
Majestic Theatre        | New York City | NY         |      2549 |   894275.00
Nederlander Theatre     | New York City | NY         |      2934 |   936312.00
Pasadena Playhouse      | Pasadena      | CA         |      2739 |   820435.00
Winter Garden Theatre   | New York City | NY         |      2838 |   939257.00
(14 rows)
```

The following two examples demonstrate the rules for the scope of table references based on WITH clause subqueries. The first query runs, but the second fails with an expected error. The first query has a WITH clause subquery inside the SELECT list of the main query. The table defined by the WITH clause (HOLIDAYS) is referenced in the FROM clause of the subquery in the SELECT list: 

```
select caldate, sum(pricepaid) as daysales,
(with holidays as (select * from date where holiday ='t')
select sum(pricepaid)
from sales join holidays on sales.dateid=holidays.dateid
where caldate='2008-12-25') as dec25sales
from sales join date on sales.dateid=date.dateid
where caldate in('2008-12-25','2008-12-31')
group by caldate
order by caldate;

caldate   | daysales | dec25sales
-----------+----------+------------
2008-12-25 | 70402.00 |   70402.00
2008-12-31 | 12678.00 |   70402.00
(2 rows)
```

The second query fails because it attempts to reference the HOLIDAYS table in the main query as well as in the SELECT list subquery. The main query references are out of scope. 

```
select caldate, sum(pricepaid) as daysales,
(with holidays as (select * from date where holiday ='t')
select sum(pricepaid)
from sales join holidays on sales.dateid=holidays.dateid
where caldate='2008-12-25') as dec25sales
from sales join holidays on sales.dateid=holidays.dateid
where caldate in('2008-12-25','2008-12-31')
group by caldate
order by caldate;

ERROR:  relation "holidays" does not exist
```

## Example: Recursive CTE
<a name="r_WITH_clause-recursive-cte-example"></a>

The following is an example of a recursive CTE that returns the employees who report directly or indirectly to John. The recursive query contains a WHERE clause to limit the depth of recursion to fewer than four levels.

```
--create and populate the sample table
create table employee (
  id int,
  name varchar(20),
  manager_id int
);

insert into employee(id, name, manager_id) values
(100, 'Carlos', null),
(101, 'John', 100),
(102, 'Jorge', 101),
(103, 'Kwaku', 101),
(110, 'Liu', 101),
(106, 'Mateo', 102),
(110, 'Nikki', 103),
(104, 'Paulo', 103),
(105, 'Richard', 103),
(120, 'Saanvi', 104),
(200, 'Shirley', 104),
(201, 'Sofía', 102),
(205, 'Zhang', 104);

--run the recursive query
with recursive john_org(id, name, manager_id, level) as
( select id, name, manager_id, 1 as level
  from employee
  where name = 'John'
  union all
  select e.id, e.name, e.manager_id, level + 1 as next_level
  from employee e, john_org j
  where e.manager_id = j.id and level < 4
)
select distinct id, name, manager_id from john_org order by manager_id;
```

Following is the result of the query.

```
    id        name      manager_id
  ------+-----------+--------------
   101    John           100
   102    Jorge          101
   103    Kwaku          101
   110    Liu            101
   201    Sofía          102
   106    Mateo          102
   110    Nikki          103
   104    Paulo          103
   105    Richard        103
   120    Saanvi         104
   200    Shirley        104
   205    Zhang          104
```

Following is an organization chart for John's department.

![\[A diagram of an organization chart for John's department.\]](http://docs.aws.amazon.com/redshift/latest/dg/images/org-chart.png)


# SELECT list
<a name="r_SELECT_list"></a>

**Topics**
+ [Syntax](#r_SELECT_list-synopsis)
+ [Parameters](#r_SELECT_list-parameters)
+ [Usage notes](#r_SELECT_list_usage_notes)
+ [Examples](#r_SELECT_list-examples)

The SELECT list names the columns, functions, and expressions that you want the query to return. The list represents the output of the query. 

For more information about SQL functions, see [SQL functions reference](c_SQL_functions.md). For more information about expressions, see [Conditional expressions](c_conditional_expressions.md).

## Syntax
<a name="r_SELECT_list-synopsis"></a>

```
SELECT
[ TOP number ]
[ ALL | DISTINCT ] * | expression [ AS column_alias ] [, ...]
```

## Parameters
<a name="r_SELECT_list-parameters"></a>

TOP *number*   
TOP takes a positive integer as its argument, which defines the number of rows that are returned to the client. The behavior with the TOP clause is the same as the behavior with the LIMIT clause. The number of rows that is returned is fixed, but the set of rows isn't. To return a consistent set of rows, use TOP or LIMIT in conjunction with an ORDER BY clause. 

ALL   
A redundant keyword that defines the default behavior if you don't specify DISTINCT. `SELECT ALL *` means the same as `SELECT *` (select all rows for all columns and retain duplicates). 

DISTINCT   
Option that eliminates duplicate rows from the result set, based on matching values in one or more columns.   
If your application allows invalid foreign keys or primary keys, it can cause queries to return incorrect results. For example, a SELECT DISTINCT query might return duplicate rows if the primary key column doesn't contain all unique values. For more information, see [Defining table constraints](https://docs.aws.amazon.com/redshift/latest/dg/t_Defining_constraints.html).

\* (asterisk)   
Returns the entire contents of the table (all columns and all rows). 

 *expression*   
An expression formed from one or more columns that exist in the tables referenced by the query. An expression can contain SQL functions. For example:   

```
avg(datediff(day, listtime, saletime))
```

AS *column\_alias*   
A temporary name for the column that is used in the final result set. The AS keyword is optional. For example:   

```
avg(datediff(day, listtime, saletime)) as avgwait
```
If you don't specify an alias for an expression that isn't a simple column name, the result set applies a default name to that column.   
The alias is recognized right after it is defined in the target list. You can use an alias in other expressions defined after it in the same target list. The following example illustrates this.   

```
select clicks / impressions as probability,
       round(100 * probability, 1) as percentage
from raw_data;
```
The benefit of the lateral alias reference is that you don't need to repeat the aliased expression when building more complex expressions in the same target list. When Amazon Redshift parses this type of reference, it just inlines the previously defined aliases. If there is a column with the same name defined in the `FROM` clause as the previously aliased expression, the column in the `FROM` clause takes priority. For example, in the preceding query, if there is a column named 'probability' in table raw\_data, the 'probability' in the second expression in the target list refers to that column instead of the alias name 'probability'. 
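For example, if the hypothetical raw\_data table did contain a column named 'probability', you could sidestep the shadowing by choosing an alias name that doesn't collide with any column:

```
--'ctr' doesn't match any column name in raw_data, so the
--second expression is guaranteed to reference the alias
select clicks / impressions as ctr,
       round(100 * ctr, 1) as percentage
from raw_data;
```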

## Usage notes
<a name="r_SELECT_list_usage_notes"></a>

TOP is a SQL extension; it provides an alternative to the LIMIT behavior. You can't use TOP and LIMIT in the same query.

## Examples
<a name="r_SELECT_list-examples"></a>

The following example returns 10 rows from the SALES table. Though the query uses the TOP clause, it still returns an unpredictable set of rows because no ORDER BY clause is specified.

```
select top 10 *
from sales;
```

The following query is functionally equivalent, but uses a LIMIT clause instead of a TOP clause:

```
select *
from sales
limit 10;
```

The following example returns the first 10 rows from the SALES table using the TOP clause, ordered by the QTYSOLD column in descending order.

```
select top 10 qtysold, sellerid
from sales
order by qtysold desc, sellerid;

qtysold | sellerid
--------+----------
8 |      518
8 |      520
8 |      574
8 |      718
8 |      868
8 |     2663
8 |     3396
8 |     3726
8 |     5250
8 |     6216
(10 rows)
```

The following example returns the first two QTYSOLD and SELLERID values from the SALES table, ordered by the QTYSOLD column:

```
select top 2 qtysold, sellerid
from sales
order by qtysold desc, sellerid;

qtysold | sellerid
--------+----------
8 |      518
8 |      520
(2 rows)
```

The following example shows the list of distinct category groups from the CATEGORY table:

```
select distinct catgroup from category
order by 1;

catgroup
----------
Concerts
Shows
Sports
(3 rows)

--the same query, run without distinct
select catgroup from category
order by 1;

catgroup
----------
Concerts
Concerts
Concerts
Shows
Shows
Shows
Sports
Sports
Sports
Sports
Sports
(11 rows)
```

The following example returns the distinct set of week numbers for December 2008. Without the DISTINCT clause, the statement would return 31 rows, or one for each day of the month.

```
select distinct week, month, year
from date
where month='DEC' and year=2008
order by 1, 2, 3;

week | month | year
-----+-------+------
49 | DEC   | 2008
50 | DEC   | 2008
51 | DEC   | 2008
52 | DEC   | 2008
53 | DEC   | 2008
(5 rows)
```



# EXCLUDE column\_list
<a name="r_EXCLUDE_list"></a>

The EXCLUDE column\_list names the columns that are excluded from the query results. Using the EXCLUDE option is helpful when only a subset of columns needs to be excluded from a *wide* table, which is a table that contains many columns. 

**Topics**
+ [Syntax](#r_EXCLUDE_list-synopsis)
+ [Parameters](#r_EXCLUDE_list-parameters)
+ [Examples](#r_EXCLUDE_list-examples)

## Syntax
<a name="r_EXCLUDE_list-synopsis"></a>

```
EXCLUDE column_list
```

## Parameters
<a name="r_EXCLUDE_list-parameters"></a>

 *column\_list*   
A comma-separated list of one or more column names that exist in the tables referenced by the query. The *column\_list* can optionally be enclosed in parentheses. Only column names are supported in the exclude list of column names, not expressions (such as `upper(col1)`) or the asterisk (\*).  

```
column-name, ... | ( column-name, ... )
```
For example:   

```
SELECT * EXCLUDE col1, col2 FROM tablea;
```

```
SELECT * EXCLUDE (col1, col2) FROM tablea;
```

## Examples
<a name="r_EXCLUDE_list-examples"></a>

The following examples use the SALES table that contains columns: salesid, listid, sellerid, buyerid, eventid, dateid, qtysold, pricepaid, commission, and saletime. For more information about the SALES table, see [Sample database](c_sampledb.md).

The following example returns rows from the SALES table, but excludes the SALETIME column.

```
SELECT * EXCLUDE saletime FROM sales;

salesid | listid  | sellerid | buyerid | eventid | dateid  | qtysold  | pricepaid  | commission
--------+---------+----------+---------+---------+---------+----------+------------+-----------
150314  | 173969  | 48680    | 816     | 8762    | 1827    | 2        | 688        | 103.2	
8325    | 8942    | 23600    | 1078    | 2557    | 1828    | 5        | 525        |  78.75	
46807   | 52711   | 34388    | 1047    | 2046    | 1828    | 2        | 482        |  72.3	
...
```

The following example returns rows from the SALES table, but excludes the QTYSOLD and SALETIME columns.

```
SELECT * EXCLUDE (qtysold, saletime) FROM sales;

salesid | listid  | sellerid | buyerid | eventid | dateid  | pricepaid  | commission
--------+---------+----------+---------+---------+---------+------------+-----------
150314  | 173969  | 48680    | 816     | 8762    | 1827    | 688        | 103.2	
8325    | 8942    | 23600    | 1078    | 2557    | 1828    | 525        |  78.75	
46807   | 52711   | 34388    | 1047    | 2046    | 1828    | 482        |  72.3	
...
```

The following example creates a view that returns rows from the SALES table, but excludes the SALETIME column.

```
CREATE VIEW sales_view AS SELECT * EXCLUDE saletime FROM sales;
SELECT * FROM sales_view;

salesid | listid  | sellerid | buyerid | eventid | dateid  | qtysold  | pricepaid  | commission
--------+---------+----------+---------+---------+---------+----------+------------+-----------
150314  | 173969  | 48680    | 816     | 8762    | 1827    | 2        | 688        | 103.2	
8325    | 8942    | 23600    | 1078    | 2557    | 1828    | 5        | 525        |  78.75	
46807   | 52711   | 34388    | 1047    | 2046    | 1828    | 2        | 482        |  72.3	
...
```

The following example uses SELECT INTO to create a temporary table that contains only the columns that aren't excluded.

```
SELECT * EXCLUDE saletime INTO TEMP temp_sales FROM sales;
SELECT * FROM temp_sales;

salesid | listid  | sellerid | buyerid | eventid | dateid  | qtysold  | pricepaid  | commission
--------+---------+----------+---------+---------+---------+----------+------------+-----------
150314  | 173969  | 48680    | 816     | 8762    | 1827    | 2        | 688        | 103.2	
8325    | 8942    | 23600    | 1078    | 2557    | 1828    | 5        | 525        |  78.75	
46807   | 52711   | 34388    | 1047    | 2046    | 1828    | 2        | 482        |  72.3	
...
```

# FROM clause
<a name="r_FROM_clause30"></a>

The FROM clause in a query lists the table references (tables, views, and subqueries) that data is selected from. If multiple table references are listed, the tables must be joined, using appropriate syntax in either the FROM clause or the WHERE clause. If no join criteria are specified, the system processes the query as a cross-join (Cartesian product). 

**Topics**
+ [Syntax](#r_FROM_clause30-synopsis)
+ [Parameters](#r_FROM_clause30-parameters)
+ [Usage notes](#r_FROM_clause_usage_notes)
+ [PIVOT and UNPIVOT examples](r_FROM_clause-pivot-unpivot-examples.md)
+ [JOIN examples](r_Join_examples.md)
+ [UNNEST examples](r_FROM_clause-unnest-examples.md)

## Syntax
<a name="r_FROM_clause30-synopsis"></a>

```
FROM table_reference [, ...]
```

where *table\_reference* is one of the following: 

```
with_subquery_table_name [ table_alias ]
table_name [ * ] [ table_alias ]
( subquery ) [ table_alias ]
table_reference [ NATURAL ] join_type table_reference
   [ ON join_condition | USING ( join_column [, ...] ) ]
table_reference  join_type super_expression 
   [ ON join_condition ]
table_reference PIVOT ( 
   aggregate(expr) [ [ AS ] aggregate_alias ]
   FOR column_name IN ( expression [ AS ] in_alias [, ...] )
) [ table_alias ]
table_reference UNPIVOT [ INCLUDE NULLS | EXCLUDE NULLS ] ( 
   value_column_name 
   FOR name_column_name IN ( column_reference [ [ AS ]
   in_alias ] [, ...] )
) [ table_alias ]
UNPIVOT expression AS value_alias [ AT attribute_alias ]
( super_expression.attribute_name ) AS value_alias [ AT index_alias ]
UNNEST ( column_reference )
  [AS] table_alias ( unnested_column_name )
UNNEST ( column_reference ) WITH OFFSET
  [AS] table_alias ( unnested_column_name, [offset_column_name] )
```

The optional *table\_alias* can be used to give temporary names to tables and complex table references and, if desired, their columns as well, like the following: 

```
[ AS ] alias [ ( column_alias [, ...] ) ]
```
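For example, the following query (using the sample SALES table) renames a derived table and its columns in one step:

```
select t.seller, t.total_sales
from (select sellerid, sum(pricepaid)
      from sales
      group by sellerid) as t (seller, total_sales)
order by t.total_sales desc
limit 5;
```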

## Parameters
<a name="r_FROM_clause30-parameters"></a>

 *with\_subquery\_table\_name*   
A table defined by a subquery in the [WITH clause](r_WITH_clause.md). 

 *table\_name*   
Name of a table or view. 

 *alias*   
Temporary alternative name for a table or view. An alias must be supplied for a table derived from a subquery. In other table references, aliases are optional. The AS keyword is always optional. Table aliases provide a convenient shortcut for identifying tables in other parts of a query, such as the WHERE clause. For example:   

```
select * from sales s, listing l
where s.listid=l.listid
```

 *column\_alias*   
Temporary alternative name for a column in a table or view. 

 *subquery*   
A query expression that evaluates to a table. The table exists only for the duration of the query and is typically given a name or *alias*. However, an alias isn't required. You can also define column names for tables that derive from subqueries. Naming column aliases is important when you want to join the results of subqueries to other tables and when you want to select or constrain those columns elsewhere in the query.   
A subquery may contain an ORDER BY clause, but this clause may have no effect if a LIMIT or OFFSET clause isn't also specified. 
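For example, in the following query the subquery's ORDER BY takes effect because it's paired with LIMIT, so the join sees only the three top-grossing sellers. This sketch uses the sample SALES and USERS tables:

```
select u.firstname, u.lastname, t.topsales
from (select sellerid, sum(pricepaid) as topsales
      from sales
      group by sellerid
      order by topsales desc
      limit 3) t
join users u on u.userid = t.sellerid;
```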

NATURAL   
Defines a join that automatically uses all pairs of identically named columns in the two tables as the joining columns. No explicit join condition is required. For example, if the CATEGORY and EVENT tables both have columns named CATID, a natural join of those tables is a join over their CATID columns.   
If a NATURAL join is specified but no identically named pairs of columns exist in the tables to be joined, the query defaults to a cross-join. 
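For example, the following natural join of the sample EVENT and CATEGORY tables joins over CATID, their only identically named column:

```
select eventname, catname
from event natural join category
limit 5;
```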

 *join\_type*   
Specify one of the following types of join:   
+ [INNER] JOIN 
+ LEFT [OUTER] JOIN 
+ RIGHT [OUTER] JOIN 
+ FULL [OUTER] JOIN 
+ CROSS JOIN 
Cross-joins are unqualified joins; they return the Cartesian product of the two tables.   
Inner and outer joins are qualified joins. They are qualified either implicitly (in natural joins); with the ON or USING syntax in the FROM clause; or with a WHERE clause condition.   
An inner join returns matching rows only, based on the join condition or list of joining columns. An outer join returns all of the rows that the equivalent inner join would return plus non-matching rows from the "left" table, "right" table, or both tables. The left table is the first-listed table, and the right table is the second-listed table. The non-matching rows contain NULL values to fill the gaps in the output columns. 

ON *join\_condition*   
Type of join specification where the joining columns are stated as a condition that follows the ON keyword. For example:   

```
sales join listing
on sales.listid=listing.listid and sales.eventid=listing.eventid
```

USING ( *join\_column* [, ...] )   
Type of join specification where the joining columns are listed in parentheses. If multiple joining columns are specified, they are delimited by commas. The USING keyword must precede the list. For example:   

```
sales join listing
using (listid,eventid)
```

PIVOT  
Rotates output from rows to columns, for the purpose of representing tabular data in a format that is easy to read. Output is represented horizontally across multiple columns. PIVOT is similar to a GROUP BY query with an aggregation, using an aggregate expression to specify an output format. However, in contrast to GROUP BY, the results are returned in columns instead of rows.  
For examples that show how to query with PIVOT and UNPIVOT, see [PIVOT and UNPIVOT examples](r_FROM_clause-pivot-unpivot-examples.md).

UNPIVOT  
*Rotating columns into rows with UNPIVOT* – The operator transforms result columns from an input table or query results into rows, to make the output easier to read. UNPIVOT combines the data of its input columns into two result columns: a name column and a value column. The name column contains column names from the input, as row entries. The value column contains values from the input columns, such as results of an aggregation (for example, the counts of items in various categories).  
*Object unpivoting with UNPIVOT (SUPER)* – You can perform object unpivoting, where *expression* is a SUPER expression referring to another FROM clause item. For more information, see [Object unpivoting](query-super.md#unpivoting). It also has examples that show how to query semi-structured data, such as data that's JSON-formatted.

*super\_expression*  
A valid SUPER expression. Amazon Redshift returns one row for each value in the specified attribute. For more information on the SUPER data type, see [SUPER type](r_SUPER_type.md). For more information about unnested SUPER values, see [Unnesting queries](query-super.md#unnest).

*attribute\_name*  
The name of an attribute in the SUPER expression.

*index\_alias*  
Alias for the index that signifies the value's position in the SUPER expression.

UNNEST  
Expands a nested structure, typically a SUPER array, into columns containing the unnested elements. For more information on unnesting SUPER data, see [Querying semi-structured data](query-super.md). For examples, see [UNNEST examples](r_FROM_clause-unnest-examples.md). 

*unnested\_column\_name*  
The name of the column that contains the unnested elements. 

UNNEST ... WITH OFFSET  
Adds an offset column to the unnested output, with the offset representing the zero-based index of each element in the array. This variant is useful when you want to see the position of elements within an array. For more information on unnesting SUPER data, see [Querying semi-structured data](query-super.md). For examples, see [UNNEST examples](r_FROM_clause-unnest-examples.md). 

*offset\_column\_name*  
A custom name for the offset column that lets you explicitly define how the index column will appear in the output. This parameter is optional. By default, the offset column name is `offset_col`. 
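As a sketch, assuming a hypothetical `orders` table with a SUPER column named `items` that holds an array, the following query returns one row per array element along with its zero-based position:

```
--hypothetical table: orders(order_id int, items super)
select o.order_id, t.item, t.item_idx
from orders o, unnest(o.items) with offset as t (item, item_idx);
```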

## Usage notes
<a name="r_FROM_clause_usage_notes"></a>

Joining columns must have comparable data types. 

A NATURAL or USING join retains only one of each pair of joining columns in the intermediate result set. 

A join with the ON syntax retains both joining columns in its intermediate result set. 
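For example, when joining the sample SALES and LISTING tables with USING, the intermediate result contains a single `listid` column, so an unqualified reference is unambiguous. With ON, both tables' `listid` columns are retained, so the reference must be qualified:

```
--USING: one listid column in the intermediate result
select listid
from sales join listing using (listid);

--ON: both listid columns are retained; qualify the reference
select sales.listid
from sales join listing on sales.listid = listing.listid;
```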

See also [WITH clause](r_WITH_clause.md). 

# PIVOT and UNPIVOT examples
<a name="r_FROM_clause-pivot-unpivot-examples"></a>

PIVOT and UNPIVOT are parameters in the FROM clause that rotate query output from rows to columns and columns to rows, respectively. They represent tabular query results in a format that's easy to read. The following examples use test data and queries to show how to use them.

For more information about these and other parameters, see [FROM clause](https://docs.aws.amazon.com/redshift/latest/dg/r_FROM_clause30.html).

## PIVOT examples
<a name="r_FROM_clause-pivot-examples"></a>

Set up the sample table and data and use them to run the subsequent example queries.

```
CREATE TABLE part (
    partname varchar,
    manufacturer varchar,
    quality int,
    price decimal(12, 2)
);

INSERT INTO part VALUES ('prop', 'local parts co', 2, 10.00);
INSERT INTO part VALUES ('prop', 'big parts co', NULL, 9.00);
INSERT INTO part VALUES ('prop', 'small parts co', 1, 12.00);

INSERT INTO part VALUES ('rudder', 'local parts co', 1, 2.50);
INSERT INTO part VALUES ('rudder', 'big parts co', 2, 3.75);
INSERT INTO part VALUES ('rudder', 'small parts co', NULL, 1.90);

INSERT INTO part VALUES ('wing', 'local parts co', NULL, 7.50);
INSERT INTO part VALUES ('wing', 'big parts co', 1, 15.20);
INSERT INTO part VALUES ('wing', 'small parts co', NULL, 11.80);
```

PIVOT on `partname` with an `AVG` aggregation on `price`.

```
SELECT *
FROM (SELECT partname, price FROM part) PIVOT (
    AVG(price) FOR partname IN ('prop', 'rudder', 'wing')
);
```

The query results in the following output.

```
  prop   |  rudder  |  wing
---------+----------+---------
 10.33   | 2.71     | 11.50
```

In the previous example, the results are transformed into columns. The following example shows a `GROUP BY` query that returns the average prices in rows, rather than in columns.

```
SELECT partname, avg(price)
FROM (SELECT partname, price FROM part)
WHERE partname IN ('prop', 'rudder', 'wing')
GROUP BY partname;
```

The query results in the following output.

```
 partname |  avg
----------+-------
 prop     | 10.33
 rudder   |  2.71
 wing     | 11.50
```

A `PIVOT` example with `manufacturer` as an implicit column.

```
SELECT *
FROM (SELECT quality, manufacturer FROM part) PIVOT (
    count(*) FOR quality IN (1, 2, NULL)
);
```

The query results in the following output.

```
 manufacturer      | 1  | 2  | null
-------------------+----+----+------
 local parts co    | 1  | 1  |  1
 big parts co      | 1  | 1  |  1
 small parts co    | 1  | 0  |  2
```

 Input table columns that are not referenced in the `PIVOT` definition are added implicitly to the result table. This is the case for the `manufacturer` column in the previous example. The example also shows that `NULL` is a valid value for the `IN` operator. 

`PIVOT` in the preceding example returns information similar to the following query, which includes `GROUP BY`. The difference is that `PIVOT` returns the value `0` for column `2` and the manufacturer `small parts co`, while the `GROUP BY` query doesn't contain a corresponding row. In most cases, `PIVOT` inserts `NULL` if a row doesn't have input data for a given column. However, the COUNT aggregate doesn't return `NULL`, so `0` is the default value.

```
SELECT manufacturer, quality, count(*)
FROM (SELECT quality, manufacturer FROM part)
WHERE quality IN (1, 2) OR quality IS NULL
GROUP BY manufacturer, quality
ORDER BY manufacturer;
```

The query results in the following output.

```
 manufacturer        | quality | count
---------------------+---------+-------
 big parts co        |         |     1
 big parts co        |       2 |     1
 big parts co        |       1 |     1
 local parts co      |       2 |     1
 local parts co      |       1 |     1
 local parts co      |         |     1
 small parts co      |       1 |     1
 small parts co      |         |     2
```

 The PIVOT operator accepts optional aliases on the aggregate expression and on each value for the `IN` operator. Use aliases to customize the column names. If there is no aggregate alias, only the `IN` list aliases are used. Otherwise, the aggregate alias is appended to the column name with an underscore to separate the names. 

```
SELECT *
FROM (SELECT quality, manufacturer FROM part) PIVOT (
    count(*) AS count FOR quality IN (1 AS high, 2 AS low, NULL AS na)
);
```

The query results in the following output.

```
 manufacturer      | high_count  | low_count | na_count
-------------------+-------------+-----------+----------
 local parts co    |           1 |         1 |        1
 big parts co      |           1 |         1 |        1
 small parts co    |           1 |         0 |        2
```

Set up the following sample table and data and use them to run the subsequent example queries. The data represents booking dates for a collection of hotels.

```
CREATE TABLE bookings (
    booking_id int,
    hotel_code char(8),
    booking_date date,
    price decimal(12, 2)
);

INSERT INTO bookings VALUES (1, 'FOREST_L', '02/01/2023', 75.12);
INSERT INTO bookings VALUES (2, 'FOREST_L', '02/02/2023', 75.00);
INSERT INTO bookings VALUES (3, 'FOREST_L', '02/04/2023', 85.54);

INSERT INTO bookings VALUES (4, 'FOREST_L', '02/08/2023', 75.00);
INSERT INTO bookings VALUES (5, 'FOREST_L', '02/11/2023', 75.00);
INSERT INTO bookings VALUES (6, 'FOREST_L', '02/14/2023', 90.00);

INSERT INTO bookings VALUES (7, 'FOREST_L', '02/21/2023', 60.00);
INSERT INTO bookings VALUES (8, 'FOREST_L', '02/22/2023', 85.00);
INSERT INTO bookings VALUES (9, 'FOREST_L', '02/27/2023', 90.00);

INSERT INTO bookings VALUES (10, 'DESERT_S', '02/01/2023', 98.00);
INSERT INTO bookings VALUES (11, 'DESERT_S', '02/02/2023', 75.00);
INSERT INTO bookings VALUES (12, 'DESERT_S', '02/04/2023', 85.00);

INSERT INTO bookings VALUES (13, 'DESERT_S', '02/05/2023', 75.00);
INSERT INTO bookings VALUES (14, 'DESERT_S', '02/06/2023', 34.00);
INSERT INTO bookings VALUES (15, 'DESERT_S', '02/09/2023', 85.00);

INSERT INTO bookings VALUES (16, 'DESERT_S', '02/12/2023', 23.00);
INSERT INTO bookings VALUES (17, 'DESERT_S', '02/13/2023', 76.00);
INSERT INTO bookings VALUES (18, 'DESERT_S', '02/14/2023', 85.00);

INSERT INTO bookings VALUES (19, 'OCEAN_WV', '02/01/2023', 98.00);
INSERT INTO bookings VALUES (20, 'OCEAN_WV', '02/02/2023', 75.00);
INSERT INTO bookings VALUES (21, 'OCEAN_WV', '02/04/2023', 85.00);

INSERT INTO bookings VALUES (22, 'OCEAN_WV', '02/06/2023', 75.00);
INSERT INTO bookings VALUES (23, 'OCEAN_WV', '02/09/2023', 34.00);
INSERT INTO bookings VALUES (24, 'OCEAN_WV', '02/12/2023', 85.00);

INSERT INTO bookings VALUES (25, 'OCEAN_WV', '02/13/2023', 23.00);
INSERT INTO bookings VALUES (26, 'OCEAN_WV', '02/14/2023', 76.00);
INSERT INTO bookings VALUES (27, 'OCEAN_WV', '02/16/2023', 85.00);

INSERT INTO bookings VALUES (28, 'CITY_BLD', '02/01/2023', 98.00);
INSERT INTO bookings VALUES (29, 'CITY_BLD', '02/02/2023', 75.00);
INSERT INTO bookings VALUES (30, 'CITY_BLD', '02/04/2023', 85.00);

INSERT INTO bookings VALUES (31, 'CITY_BLD', '02/12/2023', 75.00);
INSERT INTO bookings VALUES (32, 'CITY_BLD', '02/13/2023', 34.00);
INSERT INTO bookings VALUES (33, 'CITY_BLD', '02/17/2023', 85.00);

INSERT INTO bookings VALUES (34, 'CITY_BLD', '02/22/2023', 23.00);
INSERT INTO bookings VALUES (35, 'CITY_BLD', '02/23/2023', 76.00);
INSERT INTO bookings VALUES (36, 'CITY_BLD', '02/24/2023', 85.00);
```

 In this sample query, booking records are tallied to give a total for each week. The end date for each week becomes a column name.

```
SELECT * FROM
    (SELECT
       booking_id,
       (date_trunc('week', booking_date::date) + '5 days'::interval)::date as enddate,
       hotel_code AS "hotel code"
FROM bookings
) PIVOT (
    count(booking_id) FOR enddate IN ('2023-02-04','2023-02-11','2023-02-18') 
);
```

The query results in the following output.

```
 hotel code | 2023-02-04  | 2023-02-11 | 2023-02-18
------------+-------------+------------+----------
 FOREST_L   |           3 |          2 |        1
 DESERT_S   |           4 |          3 |        2
 OCEAN_WV   |           3 |          3 |        3
 CITY_BLD   |           3 |          1 |        2
```

 Amazon Redshift doesn't support CROSSTAB to pivot on multiple columns. However, you can change row data to columns, in a manner similar to an aggregation with PIVOT, with a query like the following. It uses the same booking sample data as the previous example.

```
SELECT 
  booking_date,
  MAX(CASE WHEN hotel_code = 'FOREST_L' THEN 'forest is booked' ELSE '' END) AS FOREST_L,
  MAX(CASE WHEN hotel_code = 'DESERT_S' THEN 'desert is booked' ELSE '' END) AS DESERT_S,
  MAX(CASE WHEN hotel_code = 'OCEAN_WV' THEN 'ocean is booked' ELSE '' END)  AS OCEAN_WV
FROM bookings
GROUP BY booking_date
ORDER BY booking_date asc;
```

The sample query results in booking dates listed next to short phrases that indicate which hotels are booked.

```
 booking_date  | forest_l         | desert_s         | ocean_wv
---------------+------------------+------------------+--------------------
 2023-02-01    | forest is booked | desert is booked |  ocean is booked
 2023-02-02    | forest is booked | desert is booked |  ocean is booked
 2023-02-04    | forest is booked | desert is booked |  ocean is booked
 2023-02-05    |                  | desert is booked |        
 2023-02-06    |                  | desert is booked |
```

The following are usage notes for `PIVOT`:
+ `PIVOT` can be applied to tables, sub-queries, and common table expressions (CTEs). `PIVOT` cannot be applied to any `JOIN` expressions, recursive CTEs, `PIVOT`, or `UNPIVOT` expressions. Also not supported are `SUPER` unnested expressions and Redshift Spectrum nested tables.
+  `PIVOT` supports the `COUNT`, `SUM`, `MIN`, `MAX`, and `AVG` aggregate functions. 
+ The `PIVOT` aggregate expression has to be a call of a supported aggregate function. Complex expressions on top of the aggregate are not supported. The aggregate arguments cannot contain references to tables other than the `PIVOT` input table. Correlated references to a parent query are also not supported. The aggregate argument may contain sub-queries. These can be correlated internally or on the `PIVOT` input table.
+  The `PIVOT IN` list values cannot be column references or sub-queries. Each value must be type compatible with the `FOR` column reference. 
+  If the `IN` list values do not have aliases, `PIVOT` generates default column names. For constant `IN` values such as 'abc' or 5, the default column name is the constant itself. For any complex expression, the column name is a standard Amazon Redshift default name such as `?column?`. 
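
The counting behavior of `PIVOT` can be sketched outside the database. The following illustrative Python snippet (the hotel codes and week-end dates are a hand-picked subset of the sample data above, not the full table) groups bookings by hotel and tallies them per `IN`-list value:

```python
from collections import defaultdict

# Hand-picked subset of the bookings sample: (hotel_code, week_end_date)
bookings = [
    ("FOREST_L", "2023-02-04"), ("FOREST_L", "2023-02-04"), ("FOREST_L", "2023-02-04"),
    ("FOREST_L", "2023-02-11"), ("FOREST_L", "2023-02-11"),
    ("OCEAN_WV", "2023-02-04"),
]
week_ends = ["2023-02-04", "2023-02-11", "2023-02-18"]

# Emulates: count(booking_id) FOR enddate IN (...), grouped by hotel code
counts = defaultdict(lambda: {w: 0 for w in week_ends})
for hotel, week_end in bookings:
    if week_end in week_ends:   # rows whose enddate is outside the IN list are dropped
        counts[hotel][week_end] += 1

print(counts["FOREST_L"])  # {'2023-02-04': 3, '2023-02-11': 2, '2023-02-18': 0}
```

Each `IN`-list value becomes a key (a column, in the SQL result), and hotels with no bookings in a week get a zero count rather than no entry.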

## UNPIVOT examples
<a name="r_FROM_clause-unpivot-examples"></a>

Set up the sample data and use it to run the subsequent examples.

```
CREATE TABLE count_by_color (quality varchar, red int, green int, blue int);

INSERT INTO count_by_color VALUES ('high', 15, 20, 7);
INSERT INTO count_by_color VALUES ('normal', 35, NULL, 40);
INSERT INTO count_by_color VALUES ('low', 10, 23, NULL);
```

`UNPIVOT` on input columns red, green, and blue.

```
SELECT *
FROM (SELECT red, green, blue FROM count_by_color) UNPIVOT (
    cnt FOR color IN (red, green, blue)
);
```

The query results in the following output.

```
 color | cnt
-------+-----
 red   |  15
 red   |  35
 red   |  10
 green |  20
 green |  23
 blue  |   7
 blue  |  40
```

By default, `NULL` values in the input column are skipped and do not yield a result row. 

The following example shows `UNPIVOT` with `INCLUDE NULLS`.

```
SELECT *
FROM (
    SELECT red, green, blue
    FROM count_by_color
) UNPIVOT INCLUDE NULLS (
    cnt FOR color IN (red, green, blue)
);
```

The following is the resulting output.

```
 color | cnt
-------+-----
 red   |  15
 red   |  35
 red   |  10
 green |  20
 green |
 green |  23
 blue  |   7
 blue  |  40
 blue  |
```

If the `INCLUDE NULLS` parameter is set, `NULL` input values generate result rows.

The following query shows `UNPIVOT` with `quality` as an implicit column.

```
SELECT *
FROM count_by_color UNPIVOT (
    cnt FOR color IN (red, green, blue)
);
```

The query results in the following output.

```
 quality | color | cnt
---------+-------+-----
 high    | red   |  15
 normal  | red   |  35
 low     | red   |  10
 high    | green |  20
 low     | green |  23
 high    | blue  |   7
 normal  | blue  |  40
```

Columns of the input table that are not referenced in the `UNPIVOT` definition are added implicitly to the result table. In the example, this is the case for the `quality` column.

The following example shows `UNPIVOT` with aliases for values in the `IN` list.

```
SELECT *
FROM count_by_color UNPIVOT (
    cnt FOR color IN (red AS r, green AS g, blue AS b)
);
```

The previous query results in the following output.

```
 quality | color | cnt
---------+-------+-----
 high    | r     |  15
 normal  | r     |  35
 low     | r     |  10
 high    | g     |  20
 low     | g     |  23
 high    | b     |   7
 normal  | b     |  40
```

The `UNPIVOT` operator accepts optional aliases on each `IN` list value. Each alias provides customization of the data in each `value` column.

The following are usage notes for `UNPIVOT`.
+ `UNPIVOT` can be applied to tables, sub-queries, and common table expressions (CTEs). `UNPIVOT` cannot be applied to any `JOIN` expressions, recursive CTEs, `PIVOT`, or `UNPIVOT` expressions. Also not supported are `SUPER` unnested expressions and Redshift Spectrum nested tables.
+ The `UNPIVOT IN` list must contain only input table column references. The `IN` list columns must have a common type that they are all compatible with. The `UNPIVOT` value column has this common type. The `UNPIVOT` name column is of type `VARCHAR`.
+ If an `IN` list value does not have an alias, `UNPIVOT` uses the column name as a default value.
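
As an illustration of these rules, the skip-versus-include behavior for `NULL` values can be emulated in a short Python sketch (the `unpivot` helper here is hypothetical, not a Redshift API):

```python
# Rows of the count_by_color sample; None stands in for SQL NULL
rows = [
    {"quality": "high",   "red": 15, "green": 20,   "blue": 7},
    {"quality": "normal", "red": 35, "green": None, "blue": 40},
    {"quality": "low",    "red": 10, "green": 23,   "blue": None},
]

def unpivot(rows, value_cols, include_nulls=False):
    out = []
    for row in rows:
        for col in value_cols:
            if row[col] is None and not include_nulls:
                continue        # default behavior: NULLs yield no result row
            # columns not in the IN list (here, quality) pass through implicitly
            out.append({"quality": row["quality"], "color": col, "cnt": row[col]})
    return out

print(len(unpivot(rows, ["red", "green", "blue"])))                      # 7
print(len(unpivot(rows, ["red", "green", "blue"], include_nulls=True)))  # 9
```

With the default, the two `NULL` cells disappear (7 rows); with the `INCLUDE NULLS` equivalent, they appear as rows with a `NULL` value (9 rows), matching the two query outputs above.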

# JOIN examples
<a name="r_Join_examples"></a>

A SQL JOIN clause is used to combine the data from two or more tables based on common fields. The results might or might not change depending on the join method specified. For more information about the syntax of a JOIN clause, see [Parameters](r_FROM_clause30.md#r_FROM_clause30-parameters). 

The following examples use data from the `TICKIT` sample data. For more information about the database schema, see [Sample database](c_sampledb.md). To learn how to load sample data, see [Loading data](https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-create-sample-db.html) in the *Amazon Redshift Getting Started Guide*.

The following query is an inner join (without the JOIN keyword) between the LISTING table and SALES table, where the LISTID from the LISTING table is between 1 and 5. This query matches LISTID column values in the LISTING table (the left table) and SALES table (the right table). The results show that LISTID 1, 4, and 5 match the criteria.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from listing, sales
where listing.listid = sales.listid
and listing.listid between 1 and 5
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     1 | 728.00 | 109.20
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
```

The following query is a left outer join. Left and right outer joins retain values from one of the joined tables when no match is found in the other table. The left and right tables are the first and second tables listed in the syntax. NULL values are used to fill the "gaps" in the result set. This query matches LISTID column values in the LISTING table (the left table) and the SALES table (the right table). The results show that LISTIDs 2 and 3 did not result in any sales.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from listing left outer join sales on sales.listid = listing.listid
where listing.listid between 1 and 5
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     1 | 728.00 | 109.20
     2 | NULL   | NULL
     3 | NULL   | NULL
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
```

The following query is a right outer join. This query matches LISTID column values in the LISTING table (the left table) and the SALES table (the right table). The results show that LISTIDs 1, 4, and 5 match the criteria.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from listing right outer join sales on sales.listid = listing.listid
where listing.listid between 1 and 5
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     1 | 728.00 | 109.20
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
```

The following query is a full join. Full joins retain values from the joined tables when no match is found in the other table. The left and right tables are the first and second tables listed in the syntax. NULL values are used to fill the "gaps" in the result set. This query matches LISTID column values in the LISTING table (the left table) and the SALES table (the right table). The results show that LISTIDs 2 and 3 did not result in any sales.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from listing full join sales on sales.listid = listing.listid
where listing.listid between 1 and 5
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     1 | 728.00 | 109.20
     2 | NULL   | NULL
     3 | NULL   | NULL
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
```
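
The retention behavior that distinguishes inner joins from outer joins can be sketched with plain Python dictionaries (an illustration of the semantics only; the totals are copied from the sample output above):

```python
# listid -> total price for LISTIDs 1-5, taken from the sample output
listing_ids = [1, 2, 3, 4, 5]
sales_totals = {1: 728.00, 4: 76.00, 5: 525.00}   # LISTIDs 2 and 3 had no sales

# An inner join keeps only matching keys; a full join keeps every key
# and fills the gaps with None (NULL in SQL)
inner = {lid: sales_totals[lid] for lid in listing_ids if lid in sales_totals}
full  = {lid: sales_totals.get(lid) for lid in listing_ids}

print(inner)  # {1: 728.0, 4: 76.0, 5: 525.0}
print(full)   # {1: 728.0, 2: None, 3: None, 4: 76.0, 5: 525.0}
```
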

The following query is a full join. This query matches LISTID column values in the LISTING table (the left table) and the SALES table (the right table). Only rows that do not result in any sales (LISTIDs 2 and 3) are in the results.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from listing full join sales on sales.listid = listing.listid
where listing.listid between 1 and 5
and (listing.listid IS NULL or sales.listid IS NULL)
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     2 | NULL   | NULL
     3 | NULL   | NULL
```

The following example is an inner join with the ON clause. In this case, NULL rows are not returned.

```
select listing.listid, sum(pricepaid) as price, sum(commission) as comm
from sales join listing
on sales.listid=listing.listid and sales.eventid=listing.eventid
where listing.listid between 1 and 5
group by 1
order by 1;

listid | price  |  comm
-------+--------+--------
     1 | 728.00 | 109.20
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
```

The following query is a cross join or Cartesian join of the LISTING table and the SALES table with a predicate to limit the results. This query matches LISTID column values in the SALES table and the LISTING table for LISTIDs 1, 2, 3, 4, and 5 in both tables. The results show that 20 rows match the criteria.

```
select sales.listid as sales_listid, listing.listid as listing_listid
from sales cross join listing
where sales.listid between 1 and 5
and listing.listid between 1 and 5
order by 1,2;

sales_listid | listing_listid
-------------+---------------
1            | 1
1            | 2
1            | 3
1            | 4
1            | 5
4            | 1
4            | 2
4            | 3
4            | 4
4            | 5
5            | 1
5            | 1
5            | 2
5            | 2
5            | 3
5            | 3
5            | 4
5            | 4
5            | 5
5            | 5
```
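
The row count of a cross join is the product of the two input row counts, which can be checked with Python's `itertools.product` (using the listing IDs from the sample; the duplicated LISTID 5 mirrors the two SALES rows for that listing):

```python
from itertools import product

sales_listids   = [1, 4, 5, 5]      # LISTID 5 appears in two SALES rows
listing_listids = [1, 2, 3, 4, 5]

# A cross join pairs every row of one input with every row of the other
pairs = sorted(product(sales_listids, listing_listids))
print(len(pairs))  # 4 * 5 = 20 rows, matching the sample output
```
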

The following example is a natural join between two tables. In this case, the columns listid, sellerid, eventid, and dateid have identical names and data types in both tables and so are used as the join columns. The results are limited to five rows.

```
select listid, sellerid, eventid, dateid, numtickets
from listing natural join sales
order by 1
limit 5;

listid | sellerid  | eventid | dateid | numtickets
-------+-----------+---------+--------+-----------
113    | 29704     | 4699    | 2075   | 22
115    | 39115     | 3513    | 2062   | 14
116    | 43314     | 8675    | 1910   | 28
118    | 6079      | 1611    | 1862   | 9
163    | 24880     | 8253    | 1888   | 14
```
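
A natural join's matching on all identically named columns can be sketched in Python (the tiny `listing` and `sales` rows here are hypothetical, trimmed to three columns each):

```python
# A natural join matches on every identically named column of the two inputs
listing = [{"listid": 1, "sellerid": 10, "numtickets": 4}]
sales   = [{"listid": 1, "sellerid": 10, "qtysold": 2},
           {"listid": 1, "sellerid": 99, "qtysold": 1}]   # sellerid differs: no match

common = set(listing[0]) & set(sales[0])   # columns shared by name: listid, sellerid
joined = [
    {**l, **s}
    for l in listing for s in sales
    if all(l[c] == s[c] for c in common)
]
print(joined)  # [{'listid': 1, 'sellerid': 10, 'numtickets': 4, 'qtysold': 2}]
```

Only the sales row that agrees on both shared columns survives, and each shared column appears once in the result, as with SQL's NATURAL JOIN.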

The following example is a join between two tables with the USING clause. In this case, the columns listid and eventid are used as the join columns. The results are limited to five rows.

```
select listid, listing.sellerid, eventid, listing.dateid, numtickets
from listing join sales
using (listid, eventid)
order by 1
limit 5;

listid | sellerid | eventid | dateid | numtickets
-------+----------+---------+--------+-----------
1      | 36861    | 7872    | 1850   | 10
4      | 8117     | 4337    | 1970   | 8
5      | 1616     | 8647    | 1963   | 4
5      | 1616     | 8647    | 1963   | 4
6      | 47402    | 8240    | 2053   | 18
```

The following query is an inner join of two subqueries in the FROM clause. The query finds the number of sold and unsold tickets for different categories of events (concerts and shows). The FROM clause subqueries are *table* subqueries; they can return multiple columns and rows.

```
select catgroup1, sold, unsold
from
(select catgroup, sum(qtysold) as sold
from category c, event e, sales s
where c.catid = e.catid and e.eventid = s.eventid
group by catgroup) as a(catgroup1, sold)
join
(select catgroup, sum(numtickets)-sum(qtysold) as unsold
from category c, event e, sales s, listing l
where c.catid = e.catid and e.eventid = s.eventid
and s.listid = l.listid
group by catgroup) as b(catgroup2, unsold)
on a.catgroup1 = b.catgroup2
order by 1;

catgroup1 |  sold  | unsold
----------+--------+--------
Concerts  | 195444 |1067199
Shows     | 149905 | 817736
```

# UNNEST examples
<a name="r_FROM_clause-unnest-examples"></a>

UNNEST is a parameter in the FROM clause that expands nested data into columns that hold the data’s unnested elements. For information on unnesting data, see [Querying semi-structured data](query-super.md).

The following statement creates and populates the `orders` table, which contains a `products` column containing arrays of product IDs. The examples in this section use the sample data in this table. 

```
CREATE TABLE orders (
    order_id INT,
    products SUPER
);

-- Populate table
INSERT INTO orders VALUES
(1001, JSON_PARSE('[
        {
            "product_id": "P456",
            "name": "Monitor",
            "price": 299.99,
            "quantity": 1,
            "specs": {
                "size": "27 inch",
                "resolution": "4K"
            }
        }
    ]
')),
(1002, JSON_PARSE('
    [
        {
            "product_id": "P567",
            "name": "USB Cable",
            "price": 9.99,
            "quantity": 3
        },
        {
            "product_id": "P678",
            "name": "Headphones",
            "price": 159.99,
            "quantity": 1,
            "specs": {
                "type": "Wireless",
                "battery_life": "20 hours"
            }
        }
    ]
'));
```

Following are some examples of unnesting queries with the sample data using PartiQL syntax.

## Unnesting an array without an OFFSET column
<a name="r_FROM_clause-unnest-examples-no-offset"></a>

The following query unnests the SUPER arrays in the products column, with each row representing an item from the order in `order_id`.

```
SELECT o.order_id, unnested_products.product
FROM orders o, UNNEST(o.products) AS unnested_products(product);

 order_id |                                                           product                                                           
----------+-----------------------------------------------------------------------------------------------------------------------------
     1001 | {"product_id":"P456","name":"Monitor","price":299.99,"quantity":1,"specs":{"size":"27 inch","resolution":"4K"}}
     1002 | {"product_id":"P567","name":"USB Cable","price":9.99,"quantity":3}
     1002 | {"product_id":"P678","name":"Headphones","price":159.99,"quantity":1,"specs":{"type":"Wireless","battery_life":"20 hours"}}
(3 rows)
```

The following query finds the most expensive product in each order.

```
SELECT o.order_id, MAX(unnested_products.product)
FROM orders o, UNNEST(o.products) AS unnested_products(product)
GROUP BY o.order_id;

 order_id |                                                           product                                                           
----------+-----------------------------------------------------------------------------------------------------------------------------
     1001 | {"product_id":"P456","name":"Monitor","price":299.99,"quantity":1,"specs":{"size":"27 inch","resolution":"4K"}}
     1002 | {"product_id":"P678","name":"Headphones","price":159.99,"quantity":1,"specs":{"type":"Wireless","battery_life":"20 hours"}}
(2 rows)
```

## Unnesting an array with an implicit OFFSET column
<a name="r_FROM_clause-unnest-examples-implicit-offset"></a>

The following query uses the `UNNEST ... WITH OFFSET` parameter to show the zero-based position of each product within its order array.

```
SELECT o.order_id, up.product, up.offset_col
FROM orders o, UNNEST(o.products) WITH OFFSET AS up(product);

 order_id |                                                           product                                                           | offset_col 
----------+-----------------------------------------------------------------------------------------------------------------------------+------------
     1001 | {"product_id":"P456","name":"Monitor","price":299.99,"quantity":1,"specs":{"size":"27 inch","resolution":"4K"}}             |          0
     1002 | {"product_id":"P567","name":"USB Cable","price":9.99,"quantity":3}                                                          |          0
     1002 | {"product_id":"P678","name":"Headphones","price":159.99,"quantity":1,"specs":{"type":"Wireless","battery_life":"20 hours"}} |          1
(3 rows)
```

Since the statement doesn’t specify an alias for the offset column, Amazon Redshift defaults to naming it `offset_col`.

## Unnesting an array with an explicit OFFSET column
<a name="r_FROM_clause-unnest-examples-explicit-offset"></a>

The following query also uses the `UNNEST ... WITH OFFSET` parameter to show the products within their order arrays. The difference in this query compared to the query in the previous example is that it explicitly names the offset column with the alias `idx`.

```
SELECT o.order_id, up.product, up.idx
FROM orders o, UNNEST(o.products) WITH OFFSET AS up(product, idx);

 order_id |                                                           product                                                           | idx 
----------+-----------------------------------------------------------------------------------------------------------------------------+-----
     1001 | {"product_id":"P456","name":"Monitor","price":299.99,"quantity":1,"specs":{"size":"27 inch","resolution":"4K"}}             |   0
     1002 | {"product_id":"P567","name":"USB Cable","price":9.99,"quantity":3}                                                          |   0
     1002 | {"product_id":"P678","name":"Headphones","price":159.99,"quantity":1,"specs":{"type":"Wireless","battery_life":"20 hours"}} |   1
(3 rows)
```
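
The pairing of each array element with its zero-based position can be emulated with Python's `enumerate` (an illustrative sketch using only the order IDs and product IDs from the sample data):

```python
# Emulates UNNEST ... WITH OFFSET: each array element paired with its
# zero-based index within its parent row's array
orders = {
    1001: ["P456"],
    1002: ["P567", "P678"],
}

unnested = [
    (order_id, product, idx)
    for order_id, products in orders.items()
    for idx, product in enumerate(products)
]

print(unnested)
# [(1001, 'P456', 0), (1002, 'P567', 0), (1002, 'P678', 1)]
```
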

# WHERE clause
<a name="r_WHERE_clause"></a>

The WHERE clause contains conditions that either join tables or apply predicates to columns in tables. Tables can be inner-joined by using appropriate syntax in either the WHERE clause or the FROM clause. Outer join criteria must be specified in the FROM clause. 

## Syntax
<a name="r_WHERE_clause-synopsis"></a>

```
[ WHERE condition ]
```

## *condition*
<a name="r_WHERE_clause-synopsis-condition"></a>

Any search condition with a Boolean result, such as a join condition or a predicate on a table column. The following examples are valid join conditions: 

```
sales.listid=listing.listid
sales.listid<>listing.listid
```

The following examples are valid conditions on columns in tables: 

```
catgroup like 'S%'
venueseats between 20000 and 50000
eventname in('Jersey Boys','Spamalot')
year=2008
length(catdesc)>25
date_part(month, caldate)=6
```

Conditions can be simple or complex; for complex conditions, you can use parentheses to isolate logical units. In the following example, the join condition is enclosed by parentheses. 

```
where (category.catid=event.catid) and category.catid in(6,7,8)
```

## Usage notes
<a name="r_WHERE_clause_usage_notes"></a>

You can use aliases in the WHERE clause to reference select list expressions. 

You can't restrict the results of aggregate functions in the WHERE clause; use the HAVING clause for this purpose. 

Columns that are restricted in the WHERE clause must derive from table references in the FROM clause. 

## Example
<a name="r_SELECT_synopsis-example"></a>

The following query uses a combination of different WHERE clause restrictions, including a join condition for the SALES and EVENT tables, a predicate on the EVENTNAME column, and two predicates on the STARTTIME column. 

```
select eventname, starttime, pricepaid/qtysold as costperticket, qtysold
from sales, event
where sales.eventid = event.eventid
and eventname='Hannah Montana'
and date_part(quarter, starttime) in(1,2)
and date_part(year, starttime) = 2008
order by 3 desc, 4, 2, 1 limit 10;

eventname    |      starttime      |   costperticket   | qtysold
----------------+---------------------+-------------------+---------
Hannah Montana | 2008-06-07 14:00:00 |     1706.00000000 |       2
Hannah Montana | 2008-05-01 19:00:00 |     1658.00000000 |       2
Hannah Montana | 2008-06-07 14:00:00 |     1479.00000000 |       1
Hannah Montana | 2008-06-07 14:00:00 |     1479.00000000 |       3
Hannah Montana | 2008-06-07 14:00:00 |     1163.00000000 |       1
Hannah Montana | 2008-06-07 14:00:00 |     1163.00000000 |       2
Hannah Montana | 2008-06-07 14:00:00 |     1163.00000000 |       4
Hannah Montana | 2008-05-01 19:00:00 |      497.00000000 |       1
Hannah Montana | 2008-05-01 19:00:00 |      497.00000000 |       2
Hannah Montana | 2008-05-01 19:00:00 |      497.00000000 |       4
(10 rows)
```

# Oracle-style outer joins in the WHERE clause
<a name="r_WHERE_oracle_outer"></a>

For Oracle compatibility, Amazon Redshift supports the Oracle outer-join operator (+) in WHERE clause join conditions. This operator is intended for use only in defining outer-join conditions; don't try to use it in other contexts. Other uses of this operator are silently ignored in most cases. 

An outer join returns all of the rows that the equivalent inner join would return, plus non-matching rows from one or both tables. In the FROM clause, you can specify left, right, and full outer joins. In the WHERE clause, you can specify left and right outer joins only. 

To outer join tables TABLE1 and TABLE2 and return non-matching rows from TABLE1 (a left outer join), specify `TABLE1 LEFT OUTER JOIN TABLE2` in the FROM clause or apply the (+) operator to all joining columns from TABLE2 in the WHERE clause. For all rows in TABLE1 that have no matching rows in TABLE2, the result of the query contains nulls for any select list expressions that contain columns from TABLE2. 

To produce the same behavior for all rows in TABLE2 that have no matching rows in TABLE1, specify `TABLE1 RIGHT OUTER JOIN TABLE2` in the FROM clause or apply the (+) operator to all joining columns from TABLE1 in the WHERE clause. 

## Basic syntax
<a name="r_WHERE_oracle_outer-basic-syntax"></a>

```
[ WHERE {
[ table1.column1 = table2.column1(+) ]
[ table1.column1(+) = table2.column1 ]
} ]
```

The first condition is equivalent to: 

```
from table1 left outer join table2
on table1.column1=table2.column1
```

The second condition is equivalent to: 

```
from table1 right outer join table2
on table1.column1=table2.column1
```

**Note**  
The syntax shown here covers the simple case of an equijoin over one pair of joining columns. However, other types of comparison conditions and multiple pairs of joining columns are also valid. 

For example, the following WHERE clause defines an outer join over two pairs of columns. The (+) operator must be attached to the same table in both conditions: 

```
where table1.col1 > table2.col1(+)
and table1.col2 = table2.col2(+)
```

## Usage notes
<a name="r_WHERE_oracle_outer_usage_notes"></a>

Where possible, use the standard FROM clause OUTER JOIN syntax instead of the (+) operator in the WHERE clause. Queries that contain the (+) operator are subject to the following rules: 
+ You can only use the (+) operator in the WHERE clause, and only in reference to columns from tables or views. 
+ You can't apply the (+) operator to expressions. However, an expression can contain columns that use the (+) operator. For example, the following join condition returns a syntax error: 

  ```
  event.eventid*10(+)=category.catid
  ```

  However, the following join condition is valid: 

  ```
  event.eventid(+)*10=category.catid
  ```
+ You can't use the (+) operator in a query block that also contains FROM clause join syntax. 
+ If two tables are joined over multiple join conditions, you must use the (+) operator in all or none of these conditions. A join with mixed syntax styles runs as an inner join, without warning. 
+ The (+) operator doesn't produce an outer join if you join a table in the outer query with a table that results from an inner query. 
+ To use the (+) operator to outer-join a table to itself, you must define table aliases in the FROM clause and reference them in the join condition: 

  ```
  select count(*)
  from event a, event b
  where a.eventid(+)=b.catid;
  
  count
  -------
  8798
  (1 row)
  ```
+ You can't combine a join condition that contains the (+) operator with an OR condition or an IN condition. For example: 

  ```
  select count(*) from sales, listing
  where sales.listid(+)=listing.listid or sales.salesid=0;
  ERROR:  Outer join operator (+) not allowed in operand of OR or IN.
  ```
+ In a WHERE clause that outer-joins more than two tables, the (+) operator can be applied only once to a given table. In the following example, the SALES table can't be referenced with the (+) operator in two successive joins. 

  ```
  select count(*) from sales, listing, date
  where sales.listid(+)=listing.listid and sales.dateid(+)=date.dateid;
  ERROR:  A table may be outer joined to at most one other table.
  ```
+ If the WHERE clause outer-join condition compares a column from TABLE2 with a constant, apply the (+) operator to the column. If you don't include the operator, the outer-joined rows from TABLE1, which contain nulls for the restricted column, are eliminated. See the Examples section below. 

## Examples
<a name="r_WHERE_oracle_outer-examples"></a>

The following join query specifies a left outer join of the SALES and LISTING tables over their LISTID columns: 

```
select count(*)
from sales, listing
where sales.listid = listing.listid(+);

count
--------
172456
(1 row)
```

The following equivalent query produces the same result but uses FROM clause join syntax: 

```
select count(*)
from sales left outer join listing on sales.listid = listing.listid;

count
--------
172456
(1 row)
```

The SALES table doesn't contain records for all listings in the LISTING table because not all listings result in sales. The following query outer-joins SALES and LISTING and returns rows from LISTING even when the SALES table reports no sales for a given list ID. The PRICE and COMM columns, derived from the SALES table, contain nulls in the result set for those non-matching rows. 

```
select listing.listid, sum(pricepaid) as price,
sum(commission) as comm
from listing, sales
where sales.listid(+) = listing.listid and listing.listid between 1 and 5
group by 1 order by 1;

listid | price  |  comm
--------+--------+--------
     1 | 728.00 | 109.20
     2 |        |
     3 |        |
     4 |  76.00 |  11.40
     5 | 525.00 |  78.75
(5 rows)
```

Note that when the WHERE clause join operator is used, the order of the tables in the FROM clause doesn't matter. 

An example of a more complex outer join condition in the WHERE clause is the case where the condition consists of a comparison between two table columns *and* a comparison with a constant: 

```
where category.catid=event.catid(+) and eventid(+)=796;
```

Note that the (+) operator is used in two places: first in the equality comparison between the tables and second in the comparison condition for the EVENTID column. The result of this syntax is the preservation of the outer-joined rows when the restriction on EVENTID is evaluated. If you remove the (+) operator from the EVENTID restriction, the query treats this restriction as a filter, not as part of the outer-join condition. In turn, the outer-joined rows that contain nulls for EVENTID are eliminated from the result set. 

Here is a complete query that illustrates this behavior: 

```
select catname, catgroup, eventid
from category, event
where category.catid=event.catid(+) and eventid(+)=796;

  catname  | catgroup | eventid
-----------+----------+---------
 Classical | Concerts |
 Jazz      | Concerts |
 MLB       | Sports   |
 MLS       | Sports   |
 Musicals  | Shows    |     796
 NBA       | Sports   |
 NFL       | Sports   |
 NHL       | Sports   |
 Opera     | Shows    |
 Plays     | Shows    |
 Pop       | Concerts |
(11 rows)
```

The equivalent query using FROM clause syntax is as follows: 

```
select catname, catgroup, eventid
from category left join event
on category.catid=event.catid and eventid=796;
```

If you remove the second (+) operator from the WHERE clause version of this query, it returns only 1 row (the row where `eventid=796`). 

```
select catname, catgroup, eventid
from category, event
where category.catid=event.catid(+) and eventid=796;

 catname  | catgroup | eventid
----------+----------+---------
 Musicals | Shows    |     796
(1 row)
```

# GROUP BY clause
<a name="r_GROUP_BY_clause"></a>

The GROUP BY clause identifies the grouping columns for the query. It is used to group together those rows in a table that have the same values in all the columns listed. The order in which the columns are listed does not matter. The outcome is to combine each set of rows having common values into one group row that represents all rows in the group. Use a GROUP BY to eliminate redundancy in the output and to compute aggregates that apply to the groups. Grouping columns must be declared when the query computes aggregates with standard functions such as SUM, AVG, and COUNT. For more information, see [Aggregate functions](c_Aggregate_Functions.md).

## Syntax
<a name="r_GROUP_BY_clause-syntax"></a>

```
[ GROUP BY expression [, ...] | ALL | aggregation_extension ]
```

where *aggregation_extension* is one of the following:

```
GROUPING SETS ( () | aggregation_extension [, ...] ) |
ROLLUP ( expr [, ...] ) |
CUBE ( expr [, ...] )
```

## Parameters
<a name="r_GROUP_BY_clause-parameters"></a>

 *expression*  
The list of columns or expressions must match the list of non-aggregate expressions in the select list of the query. For example, consider the following simple query.  

```
select listid, eventid, sum(pricepaid) as revenue,
count(qtysold) as numtix
from sales
group by listid, eventid
order by 3, 4, 2, 1
limit 5;

listid | eventid | revenue | numtix
-------+---------+---------+--------
89397  |      47 |   20.00 |      1
106590 |      76 |   20.00 |      1
124683 |     393 |   20.00 |      1
103037 |     403 |   20.00 |      1
147685 |     429 |   20.00 |      1
(5 rows)
```
In this query, the select list consists of two aggregate expressions. The first uses the SUM function and the second uses the COUNT function. The remaining two columns, LISTID and EVENTID, must be declared as grouping columns.  
Expressions in the GROUP BY clause can also reference the select list by using ordinal numbers. For example, the previous example could be abbreviated as follows.  

```
select listid, eventid, sum(pricepaid) as revenue,
count(qtysold) as numtix
from sales
group by 1,2
order by 3, 4, 2, 1
limit 5;

listid | eventid | revenue | numtix
-------+---------+---------+--------
89397  |      47 |   20.00 |      1
106590 |      76 |   20.00 |      1
124683 |     393 |   20.00 |      1
103037 |     403 |   20.00 |      1
147685 |     429 |   20.00 |      1
(5 rows)
```

ALL  
ALL indicates to group by all columns specified in the SELECT list except those that are aggregated. For example, consider the following query, which groups by `col1` and `col2` without requiring you to specify them individually in the GROUP BY clause. The column `col3` is the argument of the `SUM` function and thus is not grouped.  

```
SELECT col1, col2, sum(col3) FROM testtable GROUP BY ALL
```
If you EXCLUDE a column in the SELECT list, the GROUP BY ALL clause does not group the results based on that specific column.  

```
SELECT * EXCLUDE col3 FROM testtable GROUP BY ALL
```

 *aggregation_extension*  
You can use the aggregation extensions GROUPING SETS, ROLLUP, and CUBE to perform the work of multiple GROUP BY operations in a single statement. For more information on aggregation extensions and related functions, see [Aggregation extensions](r_GROUP_BY_aggregation-extensions.md). 

## Examples
<a name="r_GROUP_BY_clause-examples"></a>

The following examples use the SALES table that contains columns: salesid, listid, sellerid, buyerid, eventid, dateid, qtysold, pricepaid, commission, and saletime. For more information about the SALES table, see [Sample database](c_sampledb.md).

The following example query groups by `salesid` and `listid` without having to specify them individually in the GROUP BY clause. The column `qtysold` is the argument of the `SUM` function and thus not grouped.

```
SELECT salesid, listid, sum(qtysold) FROM sales GROUP BY ALL;

salesid | listid  | sum
--------+---------+------
33095   | 36572   | 2	
88268   | 100813  | 4	
110917  | 127048  | 1	
...
```

The following example query excludes several columns in the SELECT list, so GROUP BY ALL only groups salesid and listid.

```
SELECT * EXCLUDE sellerid, buyerid, eventid, dateid, qtysold, pricepaid, commission, saletime 
FROM sales GROUP BY ALL;

salesid | listid 
--------+---------
33095   | 36572   	
88268   | 100813 	
110917  | 127048 	
...
```

# Aggregation extensions
<a name="r_GROUP_BY_aggregation-extensions"></a>

Amazon Redshift supports aggregation extensions to do the work of multiple GROUP BY operations in a single statement.

 The examples for aggregation extensions use the `orders` table, which holds sales data for an electronics company. You can create `orders` with the following.

```
CREATE TABLE ORDERS (
    ID INT,
    PRODUCT CHAR(20),
    CATEGORY CHAR(20),
    PRE_OWNED CHAR(1),
    COST DECIMAL
);

INSERT INTO ORDERS VALUES
    (0, 'laptop',       'computers',    'T', 1000),
    (1, 'smartphone',   'cellphones',   'T', 800),
    (2, 'smartphone',   'cellphones',   'T', 810),
    (3, 'laptop',       'computers',    'F', 1050),
    (4, 'mouse',        'computers',    'F', 50);
```

## *GROUPING SETS*
<a name="r_GROUP_BY_aggregation-extensions-grouping-sets"></a>

 Computes one or more grouping sets in a single statement. A grouping set is the set of columns in a single GROUP BY clause: a set of zero or more columns by which you can group a query's result set. GROUP BY GROUPING SETS is equivalent to running a UNION ALL query on one result set grouped by different columns. For example, GROUP BY GROUPING SETS((a), (b)) is equivalent to GROUP BY a UNION ALL GROUP BY b. 

 The following example returns the cost of the order table's products grouped according to both the products' categories and the kind of products sold. 

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY GROUPING SETS(category, product);

       category       |       product        | total
----------------------+----------------------+-------
 computers            |                      |  2100
 cellphones           |                      |  1610
                      | laptop               |  2050
                      | smartphone           |  1610
                      | mouse                |    50

(5 rows)
```
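
To see the UNION ALL equivalence concretely, the preceding GROUPING SETS query can be rewritten as two grouped queries combined with UNION ALL. This is a sketch: each branch supplies a NULL placeholder for the column it doesn't group by, and you might need an explicit CAST on the placeholder if its type can't be inferred.

```
-- Equivalent to GROUP BY GROUPING SETS(category, product)
SELECT category, NULL AS product, sum(cost) AS total
FROM orders
GROUP BY category
UNION ALL
SELECT NULL AS category, product, sum(cost) AS total
FROM orders
GROUP BY product;
```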

## *ROLLUP*
<a name="r_GROUP_BY_aggregation-extensions-rollup"></a>

 Assumes a hierarchy where preceding columns are considered the parents of subsequent columns. ROLLUP groups data by the provided columns, returning extra subtotal rows representing the totals throughout all levels of grouping columns, in addition to the grouped rows. For example, you can use GROUP BY ROLLUP((a), (b)) to return a result set grouped first by a, then by b while assuming that b is a subsection of a. ROLLUP also returns a row with the whole result set without grouping columns. 

GROUP BY ROLLUP((a), (b)) is equivalent to GROUP BY GROUPING SETS((a,b), (a), ()). 
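
As a quick check of this equivalence against the `orders` table, the following two queries return the same six rows:

```
SELECT category, product, sum(cost) AS total
FROM orders
GROUP BY ROLLUP(category, product);

SELECT category, product, sum(cost) AS total
FROM orders
GROUP BY GROUPING SETS((category, product), (category), ());
```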

The following example returns the cost of the order table's products grouped first by category and then product, with product as a subdivision of category.

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY ROLLUP(category, product) ORDER BY 1,2;

       category       |       product        | total
----------------------+----------------------+-------
 cellphones           | smartphone           |  1610
 cellphones           |                      |  1610
 computers            | laptop               |  2050
 computers            | mouse                |    50
 computers            |                      |  2100
                      |                      |  3710
(6 rows)
```

## *CUBE*
<a name="r_GROUP_BY_aggregation-extensions-cube"></a>

 Groups data by the provided columns, returning extra subtotal rows representing the totals throughout all levels of grouping columns, in addition to the grouped rows. CUBE returns the same rows as ROLLUP, and adds subtotal rows for every combination of grouping columns not covered by ROLLUP. For example, you can use GROUP BY CUBE ((a), (b)) to return a result set grouped first by a, then by b while assuming that b is a subsection of a, then by b alone. CUBE also returns a row with the whole result set without grouping columns.

GROUP BY CUBE((a), (b)) is equivalent to GROUP BY GROUPING SETS((a, b), (a), (b), ()). 

The following example returns the cost of the order table's products grouped first by category and then product, with product as a subdivision of category. Unlike the preceding example for ROLLUP, the statement returns results for every combination of grouping column. 

```
SELECT category, product, sum(cost) as total
FROM orders
GROUP BY CUBE(category, product) ORDER BY 1,2;

       category       |       product        | total
----------------------+----------------------+-------
 cellphones           | smartphone           |  1610
 cellphones           |                      |  1610
 computers            | laptop               |  2050
 computers            | mouse                |    50
 computers            |                      |  2100
                      | laptop               |  2050
                      | mouse                |    50
                      | smartphone           |  1610
                      |                      |  3710
(9 rows)
```

## *GROUPING/GROUPING_ID functions*
<a name="r_GROUP_BY_aggregation-extentions-grouping"></a>

 ROLLUP and CUBE add NULL values to the result set to indicate subtotal rows. For example, GROUP BY ROLLUP((a), (b)) returns one or more rows that have a value of NULL in the b grouping column to indicate they are subtotals of fields in the a grouping column. These NULL values serve only to satisfy the format of returning tuples.

 When you run GROUP BY operations with ROLLUP and CUBE on relations that store NULL values themselves, this can produce result sets with rows that appear to have identical grouping columns. Returning to the previous example, if the b grouping column contains a stored NULL value, GROUP BY ROLLUP((a), (b)) returns a row with a value of NULL in the b grouping column that isn't a subtotal. 

 To distinguish between NULL values created by ROLLUP and CUBE and NULL values stored in the tables themselves, you can use the GROUPING function, or its alias GROUPING_ID. GROUPING takes one or more grouping columns as its arguments. For each row in the result set, it produces one bit per argument: 1 if the value in that position is a NULL created by an aggregation extension, and 0 for all other values, including stored NULL values. GROUPING then returns the bits as a single integer.

 For example, GROUPING(category, product) can return the following values for a given row, depending on the grouping column values for that row. For the purposes of this example, all NULL values in the table are NULL values created by an aggregation extension.

| `category` value | `product` value | GROUPING(category, product) |
| --- | --- | --- |
| grouped value | grouped value | 0 |
| grouped value | NULL (from extension) | 1 |
| NULL (from extension) | grouped value | 2 |
| NULL (from extension) | NULL (from extension) | 3 |

GROUPING functions appear in the SELECT list portion of the query in the following format.

```
SELECT ... [GROUPING( expr )...] ...
  GROUP BY ... {CUBE | ROLLUP| GROUPING SETS} ( expr ) ...
```
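
For example, a common pattern is to use GROUPING in a CASE expression to replace the NULLs that an aggregation extension generates with readable labels, while leaving stored NULLs untouched. The following sketch uses the `orders` table from the aggregation extension examples:

```
SELECT CASE WHEN GROUPING(category) = 1 THEN '(all categories)'
            ELSE category END AS category,
       CASE WHEN GROUPING(product) = 1 THEN '(all products)'
            ELSE product END AS product,
       sum(cost) AS total
FROM orders
GROUP BY ROLLUP(category, product)
ORDER BY 1, 2;
```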

The following example is the same as the preceding example for CUBE, but with the addition of GROUPING functions for its grouping sets.

```
SELECT category, product,
       GROUPING(category) as grouping0,
       GROUPING(product) as grouping1,
       GROUPING(category, product) as grouping2,
       sum(cost) as total
FROM orders
GROUP BY CUBE(category, product) ORDER BY 3,1,2;

       category       |       product        | grouping0 | grouping1 | grouping2 | total
----------------------+----------------------+-----------+-----------+-----------+-------
 cellphones           | smartphone           |         0 |         0 |         0 |  1610
 cellphones           |                      |         0 |         1 |         1 |  1610
 computers            | laptop               |         0 |         0 |         0 |  2050
 computers            | mouse                |         0 |         0 |         0 |    50
 computers            |                      |         0 |         1 |         1 |  2100
                      | laptop               |         1 |         0 |         2 |  2050
                      | mouse                |         1 |         0 |         2 |    50
                      | smartphone           |         1 |         0 |         2 |  1610
                      |                      |         1 |         1 |         3 |  3710
(9 rows)
```

## *Partial ROLLUP and CUBE*
<a name="r_GROUP_BY_aggregation-extentions-partial"></a>

 You can run ROLLUP and CUBE operations with only a portion of the subtotals. 

 The syntax for partial ROLLUP and CUBE operations is as follows.

```
GROUP BY expr1, { ROLLUP | CUBE }(expr2 [, ...])
```

Here, the GROUP BY clause only creates subtotal rows at the level of *expr2* and onwards.

The following examples show partial ROLLUP and CUBE operations on the orders table, grouping first by whether a product is pre-owned and then running ROLLUP and CUBE on the category and product columns.

```
SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY pre_owned, ROLLUP(category, product) ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
(9 rows)

SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY pre_owned, CUBE(category, product) ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
 F         |                      | laptop               |        4 |  1050
 F         |                      | mouse                |        4 |    50
 T         |                      | laptop               |        4 |  1000
 T         |                      | smartphone           |        4 |  1610
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
(13 rows)
```

Because the `pre_owned` column isn't included in the ROLLUP and CUBE operations, there's no grand total row that includes all other rows. 

## *Concatenated grouping*
<a name="r_GROUP_BY_aggregation-extentions-concat"></a>

 You can concatenate multiple GROUPING SETS/ROLLUP/CUBE clauses to calculate different levels of subtotals. Concatenated groupings return the Cartesian product of the provided grouping sets. 

 The syntax for concatenating GROUPING SETS/ROLLUP/CUBE clauses is as follows.

```
GROUP BY {ROLLUP|CUBE|GROUPING SETS}(expr1[, ...]),
         {ROLLUP|CUBE|GROUPING SETS}(expr2[, ...])[, ...]
```

Consider the following example to see how a small concatenated grouping can produce a large final result set.

```
SELECT pre_owned, category, product,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY CUBE(category, product), GROUPING SETS(pre_owned, ())
ORDER BY 4,1,2,3;

 pre_owned |       category       |       product        | group_id | total
-----------+----------------------+----------------------+----------+-------
 F         | computers            | laptop               |        0 |  1050
 F         | computers            | mouse                |        0 |    50
 T         | cellphones           | smartphone           |        0 |  1610
 T         | computers            | laptop               |        0 |  1000
           | cellphones           | smartphone           |        1 |  1610
           | computers            | laptop               |        1 |  2050
           | computers            | mouse                |        1 |    50
 F         | computers            |                      |        2 |  1100
 T         | cellphones           |                      |        2 |  1610
 T         | computers            |                      |        2 |  1000
           | cellphones           |                      |        3 |  1610
           | computers            |                      |        3 |  2100
 F         |                      | laptop               |        4 |  1050
 F         |                      | mouse                |        4 |    50
 T         |                      | laptop               |        4 |  1000
 T         |                      | smartphone           |        4 |  1610
           |                      | laptop               |        5 |  2050
           |                      | mouse                |        5 |    50
           |                      | smartphone           |        5 |  1610
 F         |                      |                      |        6 |  1100
 T         |                      |                      |        6 |  2610
           |                      |                      |        7 |  3710
(22 rows)
```

## *Nested grouping*
<a name="r_GROUP_BY_aggregation-extentions-nested"></a>

 You can use GROUPING SETS/ROLLUP/CUBE operations as the GROUPING SETS *expr* to form a nested grouping. The subgroupings inside nested GROUPING SETS are flattened. 

 The syntax for nested grouping is as follows.

```
GROUP BY GROUPING SETS({ROLLUP|CUBE|GROUPING SETS}(expr[, ...])[, ...])
```

Consider the following example.

```
SELECT category, product, pre_owned,
       GROUPING(category, product, pre_owned) as group_id,
       sum(cost) as total
FROM orders
GROUP BY GROUPING SETS(ROLLUP(category), CUBE(product, pre_owned))
ORDER BY 4,1,2,3;

       category       |       product        | pre_owned | group_id | total
----------------------+----------------------+-----------+----------+-------
 cellphones           |                      |           |        3 |  1610
 computers            |                      |           |        3 |  2100
                      | laptop               | F         |        4 |  1050
                      | laptop               | T         |        4 |  1000
                      | mouse                | F         |        4 |    50
                      | smartphone           | T         |        4 |  1610
                      | laptop               |           |        5 |  2050
                      | mouse                |           |        5 |    50
                      | smartphone           |           |        5 |  1610
                      |                      | F         |        6 |  1100
                      |                      | T         |        6 |  2610
                      |                      |           |        7 |  3710
                      |                      |           |        7 |  3710
(13 rows)
```

Note that because both ROLLUP(category) and CUBE(product, pre\_owned) contain the grouping set (), the row representing the grand total is duplicated.

## *Usage notes*
<a name="r_GROUP_BY_aggregation-extensions-usage-notes"></a>
+ The GROUP BY clause supports up to 64 grouping sets. In the case of ROLLUP and CUBE, or some combination of GROUPING SETS, ROLLUP, and CUBE, this limitation applies to the implied number of grouping sets. For example, GROUP BY CUBE((a), (b)) counts as 4 grouping sets, not 2.
+ You can't use constants as grouping columns when using aggregation extensions.
+ You can't make a grouping set that contains duplicate columns.

# HAVING clause
<a name="r_HAVING_clause"></a>

The HAVING clause applies a condition to the intermediate grouped result set that a query returns.

## Syntax
<a name="r_HAVING_clause-synopsis"></a>

```
[ HAVING condition ]
```

For example, you can restrict the results of a SUM function:

```
having sum(pricepaid) >10000
```

The HAVING condition is applied after all WHERE clause conditions are applied and GROUP BY operations are completed.

The condition itself takes the same form as any WHERE clause condition.
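
As a sketch of that evaluation order, the following query filters individual rows with WHERE before grouping, then filters the grouped results with HAVING (the thresholds here are arbitrary):

```
select eventid, sum(pricepaid) as revenue
from sales
where qtysold > 1              -- row-level filter, applied before grouping
group by eventid
having sum(pricepaid) > 5000   -- group-level filter, applied after grouping
order by 2 desc;
```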

## Usage notes
<a name="r_HAVING_clause_usage_notes"></a>
+ Any column that is referenced in a HAVING clause condition must be either a grouping column or a column that refers to the result of an aggregate function.
+ In a HAVING clause, you can't specify:
  + An ordinal number that refers to a select list item. Only the GROUP BY and ORDER BY clauses accept ordinal numbers.

## Examples
<a name="r_HAVING_clause-examples"></a>

The following query calculates total ticket sales for all events by name, then eliminates events where the total sales were less than \$800,000. The HAVING condition is applied to the results of the aggregate function in the select list: `sum(pricepaid)`.

```
select eventname, sum(pricepaid)
from sales join event on sales.eventid = event.eventid
group by 1
having sum(pricepaid) > 800000
order by 2 desc, 1;

eventname        |    sum
-----------------+-----------
Mamma Mia!       | 1135454.00
Spring Awakening |  972855.00
The Country Girl |  910563.00
Macbeth          |  862580.00
Jersey Boys      |  811877.00
Legally Blonde   |  804583.00
```

The following query calculates a similar result set. In this case, however, the HAVING condition is applied to an aggregate that isn't specified in the select list: `sum(qtysold)`. Events that did not sell more than 2,000 tickets are eliminated from the final result.

```
select eventname, sum(pricepaid)
from sales join event on sales.eventid = event.eventid
group by 1
having sum(qtysold) >2000
order by 2 desc, 1;

eventname        |    sum
-----------------+-----------
Mamma Mia!       | 1135454.00
Spring Awakening |  972855.00
The Country Girl |  910563.00
Macbeth          |  862580.00
Jersey Boys      |  811877.00
Legally Blonde   |  804583.00
Chicago          |  790993.00
Spamalot         |  714307.00
```

The following query calculates total ticket sales for all events by name, then eliminates events where the total sales were less than \$800,000. The HAVING condition is applied to the results of the aggregate function in the select list using the alias `pp` for `sum(pricepaid)`.

```
select eventname, sum(pricepaid) as pp
from sales join event on sales.eventid = event.eventid
group by 1
having pp > 800000
order by 2 desc, 1;

eventname        |    pp
-----------------+-----------
Mamma Mia!       | 1135454.00
Spring Awakening |  972855.00
The Country Girl |  910563.00
Macbeth          |  862580.00
Jersey Boys      |  811877.00
Legally Blonde   |  804583.00
```

# QUALIFY clause
<a name="r_QUALIFY_clause"></a>

The QUALIFY clause filters results of a previously computed window function according to user‑specified search conditions. You can use the clause to apply filtering conditions to the result of a window function without using a subquery.

It is similar to the [HAVING clause](r_HAVING_clause.md), which applies a condition to further filter rows after the WHERE clause. The difference between QUALIFY and HAVING is that the results filtered by the QUALIFY clause can be based on the result of running window functions on the data. You can use both the QUALIFY and HAVING clauses in one query.

## Syntax
<a name="r_QUALIFY-synopsis"></a>

```
QUALIFY condition
```

**Note**  
If you're using the QUALIFY clause directly after the FROM clause, the FROM relation name must have an alias specified before the QUALIFY clause.
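
As a minimal sketch of this requirement, using a hypothetical table `t` with a column `c`:

```
-- Fails: the FROM relation has no alias
-- SELECT * FROM t QUALIFY row_number() OVER (ORDER BY c) = 1;

-- Works: the alias tt satisfies the requirement
SELECT * FROM t tt QUALIFY row_number() OVER (ORDER BY c) = 1;
```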

## Examples
<a name="r_QUALIFY-examples"></a>

The examples in this section use the sample data below.

```
create table store_sales (ss_sold_date date, ss_sold_time time, 
               ss_item text, ss_sales_price float);
insert into store_sales values ('2022-01-01', '09:00:00', 'Product 1', 100.0),
                               ('2022-01-01', '11:00:00', 'Product 2', 500.0),
                               ('2022-01-01', '15:00:00', 'Product 3', 20.0),
                               ('2022-01-01', '17:00:00', 'Product 4', 1000.0),
                               ('2022-01-01', '18:00:00', 'Product 5', 30.0),
                               ('2022-01-02', '10:00:00', 'Product 6', 5000.0),
                               ('2022-01-02', '16:00:00', 'Product 7', 5.0);
```

The following example demonstrates how to find the two most expensive items sold after 12:00 each day.

```
SELECT *
FROM store_sales ss
WHERE ss_sold_time > time '12:00:00'
QUALIFY row_number()
OVER (PARTITION BY ss_sold_date ORDER BY ss_sales_price DESC) <= 2;

 ss_sold_date | ss_sold_time |  ss_item  | ss_sales_price 
--------------+--------------+-----------+----------------
 2022-01-01   | 17:00:00     | Product 4 |           1000
 2022-01-01   | 18:00:00     | Product 5 |             30
 2022-01-02   | 16:00:00     | Product 7 |              5
```

You can then find the last item sold each day.

```
SELECT *
FROM store_sales ss
QUALIFY last_value(ss_item)
OVER (PARTITION BY ss_sold_date ORDER BY ss_sold_time ASC
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) = ss_item;
               
ss_sold_date | ss_sold_time |  ss_item  | ss_sales_price 
--------------+--------------+-----------+----------------
 2022-01-01   | 18:00:00     | Product 5 |             30
 2022-01-02   | 16:00:00     | Product 7 |              5
```

The following example returns the same records as the previous query, the last item sold each day, but it doesn't use the QUALIFY clause.

```
SELECT * FROM (
  SELECT *,
  last_value(ss_item)
  OVER (PARTITION BY ss_sold_date ORDER BY ss_sold_time ASC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) ss_last_item
  FROM store_sales ss
)
WHERE ss_last_item = ss_item;
               
 ss_sold_date | ss_sold_time |  ss_item  | ss_sales_price | ss_last_item 
--------------+--------------+-----------+----------------+--------------
 2022-01-02   | 16:00:00     | Product 7 |              5 | Product 7
 2022-01-01   | 18:00:00     | Product 5 |             30 | Product 5
```

# UNION, INTERSECT, and EXCEPT
<a name="r_UNION"></a>

**Topics**
+ [Syntax](#r_UNION-synopsis)
+ [Parameters](#r_UNION-parameters)
+ [Order of evaluation for set operators](#r_UNION-order-of-evaluation-for-set-operators)
+ [Usage notes](#r_UNION-usage-notes)
+ [Example UNION queries](c_example_union_query.md)
+ [Example UNION ALL query](c_example_unionall_query.md)
+ [Example INTERSECT queries](c_example_intersect_query.md)
+ [Example EXCEPT query](c_Example_MINUS_query.md)

The UNION, INTERSECT, and EXCEPT *set operators* are used to compare and merge the results of two separate query expressions. For example, if you want to know which users of a website are both buyers and sellers but their user names are stored in separate columns or tables, you can find the *intersection* of these two types of users. If you want to know which website users are buyers but not sellers, you can use the EXCEPT operator to find the *difference* between the two lists of users. If you want to build a list of all users, regardless of role, you can use the UNION operator.

## Syntax
<a name="r_UNION-synopsis"></a>

```
query
{ UNION [ ALL ] | INTERSECT | EXCEPT | MINUS }
query
```

## Parameters
<a name="r_UNION-parameters"></a>

 *query*   
A query expression that corresponds, in the form of its select list, to a second query expression that follows the UNION, INTERSECT, or EXCEPT operator. The two expressions must contain the same number of output columns with compatible data types; otherwise, the two result sets can't be compared and merged. Set operations don't allow implicit conversion between different categories of data types; for more information, see [Type compatibility and conversion](c_Supported_data_types.md#r_Type_conversion).  
You can build queries that contain an unlimited number of query expressions and link them with UNION, INTERSECT, and EXCEPT operators in any combination. For example, the following query structure is valid, assuming that the tables T1, T2, and T3 contain compatible sets of columns:   

```
select * from t1
union
select * from t2
except
select * from t3
order by c1;
```

UNION   
Set operation that returns rows from two query expressions, regardless of whether the rows derive from one or both expressions.

INTERSECT   
Set operation that returns rows that derive from two query expressions. Rows that aren't returned by both expressions are discarded.

EXCEPT | MINUS   
Set operation that returns rows that derive from one of two query expressions. To qualify for the result, rows must exist in the first result table but not the second. MINUS and EXCEPT are exact synonyms. 

ALL   
The ALL keyword retains any duplicate rows that are produced by UNION. The default behavior when the ALL keyword isn't used is to discard these duplicates. INTERSECT ALL, EXCEPT ALL, and MINUS ALL aren't supported.

## Order of evaluation for set operators
<a name="r_UNION-order-of-evaluation-for-set-operators"></a>

The UNION and EXCEPT set operators are left-associative. If parentheses aren't specified to influence the order of precedence, a combination of these set operators is evaluated from left to right. For example, in the following query, the UNION of T1 and T2 is evaluated first, then the EXCEPT operation is performed on the UNION result: 

```
select * from t1
union
select * from t2
except
select * from t3
order by c1;
```

The INTERSECT operator takes precedence over the UNION and EXCEPT operators when a combination of operators is used in the same query. For example, the following query evaluates the intersection of T2 and T3, then unions the result with T1: 

```
select * from t1
union
select * from t2
intersect
select * from t3
order by c1;
```

By adding parentheses, you can enforce a different order of evaluation. In the following case, the result of the union of T1 and T2 is intersected with T3, and the query is likely to produce a different result. 

```
(select * from t1
union
select * from t2)
intersect
(select * from t3)
order by c1;
```

## Usage notes
<a name="r_UNION-usage-notes"></a>
+ The column names returned in the result of a set operation query are the column names (or aliases) from the tables in the first query expression. Because these column names are potentially misleading, in that the values in the column derive from tables on either side of the set operator, you might want to provide meaningful aliases for the result set.
+ A query expression that precedes a set operator should not contain an ORDER BY clause. An ORDER BY clause produces meaningful sorted results only when it is used at the end of a query that contains set operators. In this case, the ORDER BY clause applies to the final results of all of the set operations. The outermost query can also contain standard LIMIT and OFFSET clauses. 
+ When set operator queries return decimal results, the corresponding result columns are promoted to return the same precision and scale. For example, in the following query, where T1.REVENUE is a DECIMAL(10,2) column and T2.REVENUE is a DECIMAL(8,4) column, the decimal result is promoted to DECIMAL(12,4): 

  ```
  select t1.revenue union select t2.revenue;
  ```

  The scale is `4` because that is the maximum scale of the two columns. The precision is `12` because T1.REVENUE requires 8 digits to the left of the decimal point (12 - 4 = 8). This type promotion ensures that all values from both sides of the UNION fit in the result. For 64-bit values, the maximum result precision is 19 and the maximum result scale is 18. For 128-bit values, the maximum result precision is 38 and the maximum result scale is 37.

  If the resulting data type exceeds Amazon Redshift precision and scale limits, the query returns an error.
+ For set operations, two rows are treated as identical if, for each corresponding pair of columns, the two data values are either *equal* or *both NULL*. For example, if tables T1 and T2 both contain one column and one row, and that row is NULL in both tables, an INTERSECT operation over those tables returns that row.
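
A minimal sketch of this NULL-matching behavior, using two hypothetical one-column tables:

```
create table t1 (c1 int);
create table t2 (c1 int);
insert into t1 values (null);
insert into t2 values (null);

-- Returns one row with a NULL value: for set operations, NULL matches NULL.
select c1 from t1
intersect
select c1 from t2;
```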

# Example UNION queries
<a name="c_example_union_query"></a>

In the following UNION query, rows in the SALES table are merged with rows in the LISTING table. Three compatible columns are selected from each table; in this case, the corresponding columns have the same names and data types. 

The final result set is ordered by the first column in the LISTING table and limited to the 5 rows with the lowest LISTID value. 

```
select listid, sellerid, eventid from listing
union select listid, sellerid, eventid from sales
order by listid, sellerid, eventid desc limit 5;

listid | sellerid | eventid
--------+----------+---------
1 |    36861 |    7872
2 |    16002 |    4806
3 |    21461 |    4256
4 |     8117 |    4337
5 |     1616 |    8647
(5 rows)
```

The following example shows how you can add a literal value to the output of a UNION query so you can see which query expression produced each row in the result set. The query identifies rows from the first query expression as "S" (for sellers) and rows from the second query expression as "B" (for buyers). 

The query identifies buyers and sellers for ticket transactions that cost \$10,000 or more. The only difference between the two query expressions on either side of the UNION operator is the joining column for the SALES table. 

```
select listid, lastname, firstname, username,
pricepaid as price, 'S' as buyorsell
from sales, users
where sales.sellerid=users.userid
and pricepaid >=10000
union
select listid, lastname, firstname, username, pricepaid,
'B' as buyorsell
from sales, users
where sales.buyerid=users.userid
and pricepaid >=10000
order by 1, 2, 3, 4, 5;

listid | lastname | firstname | username |   price   | buyorsell
--------+----------+-----------+----------+-----------+-----------
209658 | Lamb     | Colette   | VOR15LYI |  10000.00 | B
209658 | West     | Kato      | ELU81XAA |  10000.00 | S
212395 | Greer    | Harlan    | GXO71KOC |  12624.00 | S
212395 | Perry    | Cora      | YWR73YNZ |  12624.00 | B
215156 | Banks    | Patrick   | ZNQ69CLT |  10000.00 | S
215156 | Hayden   | Malachi   | BBG56AKU |  10000.00 | B
(6 rows)
```

# Example UNION ALL query
<a name="c_example_unionall_query"></a>

The following example uses a UNION ALL operator because duplicate rows, if found, need to be retained in the result. For a specific series of event IDs, the query returns 0 or more rows for each sale associated with each event, and 0 or 1 row for each listing of that event. Event IDs are unique to each row in the LISTING and EVENT tables, but there might be multiple sales for the same combination of event and listing IDs in the SALES table.

The third column in the result set identifies the source of the row. If it comes from the SALES table, it is marked "Yes" in the SALESROW column. (SALESROW is an alias for SALES.LISTID.) If the row comes from the LISTING table, it is marked "No" in the SALESROW column.

In this case, the result set consists of three sales rows for listing 500, event 7787. In other words, three different transactions took place for this listing and event combination. The other two listings, 501 and 502, did not produce any sales, so the only row that the query produces for these list IDs comes from the LISTING table (SALESROW = 'No').

```
select eventid, listid, 'Yes' as salesrow
from sales
where listid in(500,501,502)
union all
select eventid, listid, 'No'
from listing
where listid in(500,501,502)
order by listid asc;

eventid | listid | salesrow
---------+--------+----------
7787 |    500 | No
7787 |    500 | Yes
7787 |    500 | Yes
7787 |    500 | Yes
6473 |    501 | No
5108 |    502 | No
(6 rows)
```

If you run the same query without the ALL keyword, the result retains only one of the sales transactions. 

```
select eventid, listid, 'Yes' as salesrow
from sales
where listid in(500,501,502)
union
select eventid, listid, 'No'
from listing
where listid in(500,501,502)
order by listid asc;

eventid | listid | salesrow
---------+--------+----------
7787 |    500 | No
7787 |    500 | Yes
6473 |    501 | No
5108 |    502 | No
(4 rows)
```

# Example INTERSECT queries
<a name="c_example_intersect_query"></a>

Compare the following example with the first UNION example. Aside from sorting the result by LISTID in descending order, the only difference between the two examples is the set operator that is used, but the results are very different. Only one of the rows is the same: 

```
235494 |    23875 |    8771
```

 This is the only row in the limited result of 5 rows that was found in both tables.

```
select listid, sellerid, eventid from listing
intersect
select listid, sellerid, eventid from sales
order by listid desc, sellerid, eventid
limit 5;

listid | sellerid | eventid
--------+----------+---------
235494 |    23875 |    8771
235482 |     1067 |    2667
235479 |     1589 |    7303
235476 |    15550 |     793
235475 |    22306 |    7848
(5 rows)
```

The following query finds events (for which tickets were sold) that occurred at venues in both New York City and Los Angeles in March. The difference between the two query expressions is the constraint on the VENUECITY column.

```
select distinct eventname from event, sales, venue
where event.eventid=sales.eventid and event.venueid=venue.venueid
and date_part(month,starttime)=3 and venuecity='Los Angeles'
intersect
select distinct eventname from event, sales, venue
where event.eventid=sales.eventid and event.venueid=venue.venueid
and date_part(month,starttime)=3 and venuecity='New York City'
order by eventname asc;

eventname
----------------------------
A Streetcar Named Desire
Dirty Dancing
Electra
Running with Annalise
Hairspray
Mary Poppins
November
Oliver!
Return To Forever
Rhinoceros
South Pacific
The 39 Steps
The Bacchae
The Caucasian Chalk Circle
The Country Girl
Wicked
Woyzeck
(16 rows)
```

# Example EXCEPT query
<a name="c_Example_MINUS_query"></a>

The CATEGORY table in the TICKIT database contains the following 11 rows: 

```
 catid | catgroup |  catname  |                  catdesc
-------+----------+-----------+--------------------------------------------
   1   | Sports   | MLB       | Major League Baseball
   2   | Sports   | NHL       | National Hockey League
   3   | Sports   | NFL       | National Football League
   4   | Sports   | NBA       | National Basketball Association
   5   | Sports   | MLS       | Major League Soccer
   6   | Shows    | Musicals  | Musical theatre
   7   | Shows    | Plays     | All non-musical theatre
   8   | Shows    | Opera     | All opera and light opera
   9   | Concerts | Pop       | All rock and pop music concerts
  10   | Concerts | Jazz      | All jazz singers and bands
  11   | Concerts | Classical | All symphony, concerto, and choir concerts
(11 rows)
```

Assume that a CATEGORY\_STAGE table (a staging table) contains one additional row: 

```
 catid | catgroup |  catname  |                  catdesc
-------+----------+-----------+--------------------------------------------
   1   | Sports   | MLB       | Major League Baseball
   2   | Sports   | NHL       | National Hockey League
   3   | Sports   | NFL       | National Football League
   4   | Sports   | NBA       | National Basketball Association
   5   | Sports   | MLS       | Major League Soccer
   6   | Shows    | Musicals  | Musical theatre
   7   | Shows    | Plays     | All non-musical theatre
   8   | Shows    | Opera     | All opera and light opera
   9   | Concerts | Pop       | All rock and pop music concerts
  10   | Concerts | Jazz      | All jazz singers and bands
  11   | Concerts | Classical | All symphony, concerto, and choir concerts
  12   | Concerts | Comedy    | All stand up comedy performances
(12 rows)
```

Return the difference between the two tables. In other words, return rows that are in the CATEGORY\_STAGE table but not in the CATEGORY table: 

```
select * from category_stage
except
select * from category;

catid | catgroup | catname |             catdesc
-------+----------+---------+----------------------------------
12 | Concerts | Comedy  | All stand up comedy performances
(1 row)
```

The following equivalent query uses the synonym MINUS. 

```
select * from category_stage
minus
select * from category;

catid | catgroup | catname |             catdesc
-------+----------+---------+----------------------------------
12 | Concerts | Comedy  | All stand up comedy performances
(1 row)
```

If you reverse the order of the SELECT expressions, the query returns no rows. 
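Sketched following, the reversed query returns no rows because every row in the CATEGORY table also exists in the staging table:

```
select * from category
except
select * from category_stage;

catid | catgroup | catname | catdesc
-------+----------+---------+----------
(0 rows)
```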

# ORDER BY clause
<a name="r_ORDER_BY_clause"></a>

**Topics**
+ [Syntax](#r_ORDER_BY_clause-synopsis)
+ [Parameters](#r_ORDER_BY_clause-parameters)
+ [Usage notes](#r_ORDER_BY_usage_notes)
+ [Examples with ORDER BY](r_Examples_with_ORDER_BY.md)

The ORDER BY clause sorts the result set of a query.

## Syntax
<a name="r_ORDER_BY_clause-synopsis"></a>

```
[ ORDER BY expression [ ASC | DESC ] ]
[ NULLS FIRST | NULLS LAST ]
[ LIMIT { count | ALL } ]
[ OFFSET start ]
```

## Parameters
<a name="r_ORDER_BY_clause-parameters"></a>

 *expression*   
Expression that defines the sort order of the query result set, typically by specifying one or more columns in the select list. Results are returned based on binary UTF-8 ordering. You can also specify the following:  
+ Columns that aren't in the select list
+ Expressions formed from one or more columns that exist in the tables referenced by the query
+ Ordinal numbers that represent the position of select list entries (or the position of columns in the table if no select list exists)
+ Aliases that define select list entries
When the ORDER BY clause contains multiple expressions, the result set is sorted according to the first expression, then the second expression is applied to rows that have matching values from the first expression, and so on.

ASC | DESC   
Option that defines the sort order for the expression, as follows:   
+ ASC: ascending (for example, low to high for numeric values and 'A' to 'Z' for character strings). If no option is specified, data is sorted in ascending order by default. 
+ DESC: descending (high to low for numeric values; 'Z' to 'A' for strings). 

NULLS FIRST | NULLS LAST  
Option that specifies whether NULL values should be ordered first, before non-null values, or last, after non-null values. By default, NULL values are sorted and ranked last in ASC ordering, and sorted and ranked first in DESC ordering.

LIMIT *number* | ALL   <a name="order-by-clause-limit"></a>
Option that controls the number of sorted rows that the query returns. The LIMIT number must be a positive integer; the maximum value is `2147483647`.   
LIMIT 0 returns no rows. You can use this syntax for testing purposes: to check that a query runs (without displaying any rows) or to return a column list from a table. An ORDER BY clause is redundant if you are using LIMIT 0 to return a column list. The default is LIMIT ALL. 

OFFSET *start*   <a name="order-by-clause-offset"></a>
Option that specifies the number of rows to skip before beginning to return rows. The OFFSET number must be a positive integer; the maximum value is `2147483647`. When used with the LIMIT option, OFFSET rows are skipped before starting to count the LIMIT rows that are returned. If the LIMIT option isn't used, the number of rows in the result set is reduced by the number of rows that are skipped. The rows skipped by an OFFSET clause still have to be scanned, so it might be inefficient to use a large OFFSET value.
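For example, the following query (a sketch against the CATEGORY table from the TICKIT sample database) skips the first three rows in CATID order and returns the next two:

```
select catid, catname from category
order by catid
limit 2 offset 3;

catid | catname
-------+---------
    4 | NBA
    5 | MLS
(2 rows)
```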

## Usage notes
<a name="r_ORDER_BY_usage_notes"></a>

 Note the following expected behavior with ORDER BY clauses: 
+ NULL values are considered "higher" than all other values. With the default ascending sort order, NULL values sort at the end. To change this behavior, use the NULLS FIRST option.
+ When a query doesn't contain an ORDER BY clause, the system returns result sets with no predictable ordering of the rows. The same query run twice might return the result set in a different order. 
+ The LIMIT and OFFSET options can be used without an ORDER BY clause; however, to return a consistent set of rows, use these options in conjunction with ORDER BY. 
+ In any parallel system like Amazon Redshift, when ORDER BY doesn't produce a unique ordering, the order of the rows is nondeterministic. That is, if the ORDER BY expression produces duplicate values, the return order of those rows might vary from other systems or from one run of Amazon Redshift to the next. 
+ Amazon Redshift doesn't support string literals in ORDER BY clauses.
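When the leading sort key can contain duplicates, one common way to make the output deterministic is to append a unique column, such as SALESID, as a tie-breaker. The following is a sketch of that pattern:

```
select salesid, qtysold from sales
order by qtysold desc, salesid  -- salesid breaks ties among equal qtysold values
limit 10;
```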

# Examples with ORDER BY
<a name="r_Examples_with_ORDER_BY"></a>

Return all 11 rows from the CATEGORY table, ordered by the second column, CATGROUP. For results that have the same CATGROUP value, order the CATDESC column values by the length of the character string. Then order by columns CATID and CATNAME. 

```
select * from category order by 2, length(catdesc), 1, 3;

catid | catgroup |  catname  |                  catdesc
------+----------+-----------+----------------------------------------
10    | Concerts | Jazz      | All jazz singers and bands
9     | Concerts | Pop       | All rock and pop music concerts
11    | Concerts | Classical | All symphony, concerto, and choir conce
6     | Shows    | Musicals  | Musical theatre
7     | Shows    | Plays     | All non-musical theatre
8     | Shows    | Opera     | All opera and light opera
5     | Sports   | MLS       | Major League Soccer
1     | Sports   | MLB       | Major League Baseball
2     | Sports   | NHL       | National Hockey League
3     | Sports   | NFL       | National Football League
4     | Sports   | NBA       | National Basketball Association
(11 rows)
```

Return selected columns from the SALES table, ordered by the highest QTYSOLD values. Limit the result to the top 10 rows: 

```
select salesid, qtysold, pricepaid, commission, saletime from sales
order by qtysold desc, pricepaid, commission, salesid, saletime desc
limit 10;

salesid | qtysold | pricepaid | commission |      saletime
--------+---------+-----------+------------+---------------------
15401   |       8 |    272.00 |      40.80 | 2008-03-18 06:54:56
61683   |       8 |    296.00 |      44.40 | 2008-11-26 04:00:23
90528   |       8 |    328.00 |      49.20 | 2008-06-11 02:38:09
74549   |       8 |    336.00 |      50.40 | 2008-01-19 12:01:21
130232  |       8 |    352.00 |      52.80 | 2008-05-02 05:52:31
55243   |       8 |    384.00 |      57.60 | 2008-07-12 02:19:53
16004   |       8 |    440.00 |      66.00 | 2008-11-04 07:22:31
489     |       8 |    496.00 |      74.40 | 2008-08-03 05:48:55
4197    |       8 |    512.00 |      76.80 | 2008-03-23 11:35:33
16929   |       8 |    568.00 |      85.20 | 2008-12-19 02:59:33
(10 rows)
```

Return a column list and no rows by using LIMIT 0 syntax: 

```
select * from venue limit 0;
venueid | venuename | venuecity | venuestate | venueseats
---------+-----------+-----------+------------+------------
(0 rows)
```

# CONNECT BY clause
<a name="r_CONNECT_BY_clause"></a>

The CONNECT BY clause specifies the relationship between rows in a hierarchy. You can use CONNECT BY to select rows in a hierarchical order by joining the table to itself and processing the hierarchical data. For example, you can use it to recursively loop through an organization chart and list data.

Hierarchical queries process in the following order:

1. If the FROM clause has a join, it is processed first.

1. The CONNECT BY clause is evaluated.

1. The WHERE clause is evaluated.

## Syntax
<a name="r_CONNECT_BY_clause-synopsis"></a>

```
[START WITH start_with_conditions]
CONNECT BY connect_by_conditions
```

**Note**  
While START and CONNECT are not reserved words, if you use START or CONNECT as a table alias in your query, use delimited identifiers (double quotation marks) or AS to avoid failure at runtime, as shown in the following examples.

```
SELECT COUNT(*)
FROM Employee "start"
CONNECT BY PRIOR id = manager_id
START WITH name = 'John'
```

```
SELECT COUNT(*)
FROM Employee AS start
CONNECT BY PRIOR id = manager_id
START WITH name = 'John'
```

## Parameters
<a name="r_CONNECT_BY_parameters"></a>

 *start\_with\_conditions*   
Conditions that specify the root rows of the hierarchy.

 *connect\_by\_conditions*   
Conditions that specify the relationship between parent rows and child rows of the hierarchy. At least one condition must be qualified with the `PRIOR` unary operator, which is used to refer to the parent row, as in the following forms.  

```
PRIOR column = expression
-- or
expression > PRIOR column
```

## Operators
<a name="r_CONNECT_BY_operators"></a>

You can use the following operators in a CONNECT BY query.

 *LEVEL*   
Pseudocolumn that returns the current row level in the hierarchy. Returns 1 for the root row, 2 for the child of the root row, and so on.

 *PRIOR*   
Unary operator that evaluates the expression for the parent row of the current row in the hierarchy.

## Examples
<a name="r_CONNECT_BY_example"></a>

The following example is a CONNECT BY query that returns the employees that report directly or indirectly to John, fewer than four levels deep (LEVEL < 4). 

```
SELECT id, name, manager_id
FROM employee
WHERE LEVEL < 4
START WITH name = 'John'
CONNECT BY PRIOR id = manager_id;
```

Following is the result of the query.

```
id      name      manager_id
------+----------+--------------
  101     John        100
  102     Jorge       101
  103     Kwaku       101
  110     Liu         101
  201     Sofía       102
  106     Mateo       102
  110     Nikki       103
  104     Paulo       103
  105     Richard     103
  120     Saanvi      104
  200     Shirley     104
  205     Zhang       104
```

 Table definition for this example: 

```
CREATE TABLE employee (
   id INT,
   name VARCHAR(20),
   manager_id INT
   );
```

 Following are the rows inserted into the table. 

```
INSERT INTO employee(id, name, manager_id)  VALUES
(100, 'Carlos', null),
(101, 'John', 100),
(102, 'Jorge', 101),
(103, 'Kwaku', 101),
(110, 'Liu', 101),
(106, 'Mateo', 102),
(110, 'Nikki', 103),
(104, 'Paulo', 103),
(105, 'Richard', 103),
(120, 'Saanvi', 104),
(200, 'Shirley', 104),
(201, 'Sofía', 102),
(205, 'Zhang', 104);
```

Following is an organization chart for John's department.

![\[A diagram of an organization chart for John's department.\]](http://docs.aws.amazon.com/redshift/latest/dg/images/org-chart.png)


# Subquery examples
<a name="r_Subquery_examples"></a>

The following examples show different ways in which subqueries fit into SELECT queries. See [JOIN examples](r_Join_examples.md) for another example of the use of subqueries. 

## SELECT list subquery
<a name="r_Subquery_examples-select-list-subquery"></a>

The following example contains a subquery in the SELECT list. This subquery is *scalar*: it returns only one column and one value, which is repeated in the result for each row that is returned from the outer query. The query compares the Q1SALES value that the subquery computes with sales values for two other quarters (2 and 3) in 2008, as defined by the outer query. 

```
select qtr, sum(pricepaid) as qtrsales,
(select sum(pricepaid)
from sales join date on sales.dateid=date.dateid
where qtr='1' and year=2008) as q1sales
from sales join date on sales.dateid=date.dateid
where qtr in('2','3') and year=2008
group by qtr
order by qtr;

qtr  |  qtrsales   |   q1sales
-------+-------------+-------------
2     | 30560050.00 | 24742065.00
3     | 31170237.00 | 24742065.00
(2 rows)
```

## WHERE clause subquery
<a name="r_Subquery_examples-where-clause-subquery"></a>

The following example contains a table subquery in the WHERE clause. This subquery produces multiple rows. In this case, the rows contain only one column, but table subqueries can contain multiple columns and rows, just like any other table. 

The query finds the top 10 sellers in terms of maximum tickets sold. The top 10 list is restricted by the subquery, which removes users who live in cities where there are ticket venues. This query can be written in different ways; for example, the subquery could be rewritten as a join within the main query. 

```
select firstname, lastname, city, max(qtysold) as maxsold
from users join sales on users.userid=sales.sellerid
where users.city not in(select venuecity from venue)
group by firstname, lastname, city
order by maxsold desc, city desc
limit 10;

 firstname  | lastname |      city      | maxsold
------------+----------+----------------+---------
 Noah       | Guerrero | Worcester      |       8
 Isadora    | Moss     | Winooski       |       8
 Kieran     | Harrison | Westminster    |       8
 Heidi      | Davis    | Warwick        |       8
 Sara       | Anthony  | Waco           |       8
 Bree       | Buck     | Valdez         |       8
 Evangeline | Sampson  | Trenton        |       8
 Kendall    | Keith    | Stillwater     |       8
 Bertha     | Bishop   | Stevens Point  |       8
 Patricia   | Anderson | South Portland |       8
(10 rows)
```
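As noted, the subquery can be rewritten as a join. One possible sketch uses a left join as an anti-join; it assumes the city columns contain no NULL values, a case where NOT IN and an anti-join can differ:

```
select firstname, lastname, users.city, max(qtysold) as maxsold
from users
join sales on users.userid = sales.sellerid
left join venue on users.city = venue.venuecity
where venue.venuecity is null   -- keep only users whose city has no ticket venue
group by firstname, lastname, users.city
order by maxsold desc, users.city desc
limit 10;
```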

## WITH clause subqueries
<a name="r_Subquery_examples-with-clause-subqueries"></a>

See [WITH clause](r_WITH_clause.md). 

# Correlated subqueries
<a name="r_correlated_subqueries"></a>

The following example contains a *correlated subquery* in the WHERE clause; this kind of subquery contains one or more correlations between its columns and the columns produced by the outer query. In this case, the correlation is `where s.listid=l.listid`. For each row that the outer query produces, the subquery is run to qualify or disqualify the row. 

```
select salesid, listid, sum(pricepaid) from sales s
where qtysold=
(select max(numtickets) from listing l
where s.listid=l.listid)
group by 1,2
order by 1,2
limit 5;

salesid | listid |   sum
--------+--------+----------
 27     |     28 | 111.00
 81     |    103 | 181.00
 142    |    149 | 240.00
 146    |    152 | 231.00
 194    |    210 | 144.00
(5 rows)
```

## Correlated subquery patterns that are not supported
<a name="r_correlated_subqueries-correlated-subquery-patterns-that-are-not-supported"></a>

The query planner uses a query rewrite method called subquery decorrelation to optimize several patterns of correlated subqueries for execution in an MPP environment. A few types of correlated subqueries follow patterns that Amazon Redshift can't decorrelate and doesn't support. Queries that contain the following correlation references return errors: 
+  Correlation references that skip a query block, also known as "skip-level correlation references." For example, in the following query, the block containing the correlation reference and the skipped block are connected by a NOT EXISTS predicate: 

  ```
  select event.eventname from event
  where not exists
  (select * from listing
  where not exists
  (select * from sales where event.eventid=sales.eventid));
  ```

  The skipped block in this case is the subquery against the LISTING table. The correlation reference correlates the EVENT and SALES tables. 
+  Correlation references from a subquery that is part of an ON clause in an outer query: 

  ```
  select * from category
  left join event
  on category.catid=event.catid and eventid =
  (select max(eventid) from sales where sales.eventid=event.eventid);
  ```

  The ON clause contains a correlation reference from SALES in the subquery to EVENT in the outer query. 
+ Null-sensitive correlation references to an Amazon Redshift system table. For example: 

  ```
  select attrelid
  from stv_locks sl, pg_attribute
  where sl.table_id=pg_attribute.attrelid and 1 not in
  (select 1 from pg_opclass where sl.lock_owner = opcowner);
  ```
+ Correlation references from within a subquery that contains a window function. 

  ```
  select listid, qtysold
  from sales s
  where qtysold not in
  (select sum(numtickets) over() from listing l where s.listid=l.listid);
  ```
+ References in a GROUP BY column to the results of a correlated subquery. For example: 

  ```
  select listing.listid,
  (select count (sales.listid) from sales where sales.listid=listing.listid) as list
  from listing
  group by list, listing.listid;
  ```
+ Correlation references from a subquery with an aggregate function and a GROUP BY clause, connected to the outer query by an IN predicate. (This restriction doesn't apply to MIN and MAX aggregate functions.) For example: 

  ```
  select * from listing where listid in
  (select sum(qtysold)
  from sales
  where numtickets>4
  group by salesid);
  ```

# SELECT INTO
<a name="r_SELECT_INTO"></a>

Selects rows defined by any query and inserts them into a new table. You can specify whether to create a temporary or a persistent table. 

## Syntax
<a name="r_SELECT_INTO-synopsis"></a>

```
[ WITH with_subquery [, ...] ]
SELECT
[ TOP number | [ ALL | DISTINCT ]
* | expression [ AS output_name ] [, ...] ]
[ EXCLUDE column_list ]
INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table
[ FROM table_reference [, ...] ]
[ WHERE condition ]
[ [ START WITH expression ] CONNECT BY expression ]
[ GROUP BY ALL | expression [, ...] ]
[ HAVING condition ]
[ QUALIFY condition ]
[ { UNION | INTERSECT | { EXCEPT | MINUS } } [ ALL ] query ]
[ ORDER BY expression [ ASC | DESC ] ]
[ LIMIT { number | ALL } ]
[ OFFSET start ]
```

 For details about the parameters of this command, see [SELECT](r_SELECT_synopsis.md). 

## Examples
<a name="r_SELECT_INTO-examples"></a>

Select all of the rows from the EVENT table and create a NEWEVENT table: 

```
select * into newevent from event;
```
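An equivalent way to write the first example uses a CREATE TABLE AS statement, which is the more common form in Amazon Redshift:

```
create table newevent as
select * from event;
```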

Select the result of an aggregate query into a temporary table called PROFITS: 

```
select username, lastname, sum(pricepaid-commission) as profit
into temp table profits
from sales, users
where sales.sellerid=users.userid
group by 1, 2
order by 3 desc;
```

# SET
<a name="r_SET"></a>

Sets the value of a server configuration parameter. Use the SET command to override a setting for the duration of the current session or transaction only.

Use the [RESET](r_RESET.md) command to return a parameter to its default value. 

You can change the server configuration parameters in several ways. For more information, see [Modifying the server configuration](cm_chap_ConfigurationRef.md#t_Modifying_the_default_settings). 

## Syntax
<a name="r_SET-synopsis"></a>

```
SET { [ SESSION | LOCAL ]
{ SEED | parameter_name } { TO | = }
{ value | 'value' | DEFAULT } |
SEED TO value }
```

The following statement sets the value of a session context variable.

```
SET { [ SESSION | LOCAL ]
variable_name { TO | = }
{ value | 'value' } }
```

## Parameters
<a name="r_SET-parameters"></a>

SESSION   
Specifies that the setting is valid for the current session. Default value.

*variable\_name*   
Specifies the name of the context variable set for the session.  
The naming convention is a two-part name separated by a dot, for example *identifier.identifier*. Only one dot separator is allowed. Use an *identifier* that follows the standard identifier rules for Amazon Redshift. For more information, see [Names and identifiers](r_names.md). Delimited identifiers aren't allowed.

LOCAL   
Specifies that the setting is valid for the current transaction. 

SEED TO *value*   
Sets an internal seed to be used by the RANDOM function for random number generation.  
SET SEED takes a numeric *value* between 0 and 1, and multiplies this number by (2^31 - 1) for use with the [RANDOM](r_RANDOM.md) function. If you use SET SEED before making multiple RANDOM calls, RANDOM generates numbers in a predictable sequence.

 *parameter\_name*   
Name of the parameter to set. See [Modifying the server configuration](cm_chap_ConfigurationRef.md#t_Modifying_the_default_settings) for information about parameters.

 *value*   
New parameter value. Use single quotation marks to set the value to a specific string. If using SET SEED, this parameter contains the SEED value. 

DEFAULT   
Sets the parameter to the default value.

## Examples
<a name="r_SET-examples"></a>

 **Changing a parameter for the current session** 

The following example sets the datestyle:

```
set datestyle to 'SQL,DMY';
```

 **Setting a query group for workload management** 

If query groups are listed in a queue definition as part of the cluster's WLM configuration, you can set the QUERY\_GROUP parameter to a listed query group name. Subsequent queries are assigned to the associated query queue. The QUERY\_GROUP setting remains in effect for the duration of the session or until a RESET QUERY\_GROUP command is encountered.

This example runs two queries as part of the query group 'priority', then resets the query group. 

```
set query_group to 'priority';
select tbl, count(*)from stv_blocklist;
select query, elapsed, substring from svl_qlog order by query desc limit 5;
reset query_group;
```

For more information, see [Workload management](cm-c-implementing-workload-management.md). 

 **Change the default identity namespace for the session** 

A database user can set `default_identity_namespace`. This sample shows how to use `SET SESSION` to override the setting for the duration of the current session and then show the new identity namespace value. This is used most commonly when you are using an identity provider with Redshift and IAM Identity Center. For more information about using an identity provider with Redshift, see [Connect Redshift with IAM Identity Center to give users a single sign-on experience](https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-access-control-idp-connect.html).

```
SET SESSION default_identity_namespace = 'MYCO';
         
SHOW default_identity_namespace;
```

After running the command, you can run a GRANT statement or a CREATE statement like the following:

```
GRANT SELECT ON TABLE mytable TO alice;

GRANT UPDATE ON TABLE mytable TO salesrole;
         
CREATE USER bob password 'md50c983d1a624280812631c5389e60d48c';
```

In this instance, the effect of setting the default identity namespace is equivalent to prefixing each identity with the namespace. In this example, `alice` is replaced with `MYCO:alice`. For more information about settings that pertain to Redshift configuration with IAM Identity Center, see [ALTER SYSTEM](r_ALTER_SYSTEM.md) and [ALTER IDENTITY PROVIDER](r_ALTER_IDENTITY_PROVIDER.md).
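For illustration, with the namespace set as shown, the preceding statements behave as if each identity carried an explicit prefix written as a delimited identifier (the names here are the sample names from above):

```
GRANT SELECT ON TABLE mytable TO "MYCO:alice";

GRANT UPDATE ON TABLE mytable TO "MYCO:salesrole";

CREATE USER "MYCO:bob" password 'md50c983d1a624280812631c5389e60d48c';
```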

 **Setting a label for a group of queries** 

The QUERY\_GROUP parameter defines a label for one or more queries that are run in the same session after a SET command. In turn, this label is logged when queries are run and can be used to constrain results returned from the STL\_QUERY and STV\_INFLIGHT system tables and the SVL\_QLOG view. 

```
show query_group;
query_group
-------------
unset
(1 row)

set query_group to '6 p.m.';


show query_group;
query_group
-------------
6 p.m.
(1 row)

select * from sales where salesid=500;
salesid | listid | sellerid | buyerid | eventid | dateid | ...
---------+--------+----------+---------+---------+--------+-----
500 |    504 |     3858 |    2123 |    5871 |   2052 | ...
(1 row)

reset query_group;

select query, trim(label) querygroup, pid, trim(querytxt) sql
from stl_query
where label ='6 p.m.';
query | querygroup |  pid  |                  sql
-------+------------+-------+----------------------------------------
57 | 6 p.m.     | 30711 | select * from sales where salesid=500;
(1 row)
```

Query group labels are a useful mechanism for isolating individual queries or groups of queries that are run as part of scripts. You don't need to identify and track queries by their IDs; you can track them by their labels.

 **Setting a seed value for random number generation** 

The following example uses the SEED option with SET to cause the RANDOM function to generate numbers in a predictable sequence.

First, return three RANDOM integers without setting the SEED value: 

```
select cast (random() * 100 as int);
int4
------
6
(1 row)

select cast (random() * 100 as int);
int4
------
68
(1 row)

select cast (random() * 100 as int);
int4
------
56
(1 row)
```

Now, set the SEED value to `.25`, and return three more RANDOM numbers: 

```
set seed to .25;

select cast (random() * 100 as int);
int4
------
21
(1 row)

select cast (random() * 100 as int);
int4
------
79
(1 row)

select cast (random() * 100 as int);
int4
------
12
(1 row)
```

Finally, reset the SEED value to `.25`, and verify that RANDOM returns the same results as the previous three calls: 

```
set seed to .25;

select cast (random() * 100 as int);
int4
------
21
(1 row)

select cast (random() * 100 as int);
int4
------
79
(1 row)

select cast (random() * 100 as int);
int4
------
12
(1 row)
```

The following examples set a customized session context variable, first to a numeric value and then to a string value. 

```
SET app_context.user_id TO 123;
SET app_context.user_id TO 'sample_variable_value';
```

# SET SESSION AUTHORIZATION
<a name="r_SET_SESSION_AUTHORIZATION"></a>

Sets the user name for the current session.

You can use the SET SESSION AUTHORIZATION command, for example, to test database access by temporarily running a session or transaction as an unprivileged user. You must be a database superuser to run this command.

## Syntax
<a name="r_SET_SESSION_AUTHORIZATION-synopsis"></a>

```
SET [ LOCAL ] SESSION AUTHORIZATION { user_name | DEFAULT }
```

## Parameters
<a name="r_SET_SESSION_AUTHORIZATION-parameters"></a>

LOCAL  
Specifies that the setting is valid for the current transaction. Omitting this parameter specifies that the setting is valid for the current session.

 *user_name*   
Name of the user to set. The user name may be written as an identifier or a string literal.

DEFAULT  
Sets the session user name to the default value.

## Examples
<a name="r_SET_SESSION_AUTHORIZATION-examples"></a>

The following example sets the user name for the current session to `dwuser`:

```
SET SESSION AUTHORIZATION 'dwuser';
```

The following example sets the user name for the current transaction to `dwuser`:

```
SET LOCAL SESSION AUTHORIZATION 'dwuser';
```

This example sets the user name for the current session to the default user name:

```
SET SESSION AUTHORIZATION DEFAULT;
```
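Combining SET LOCAL SESSION AUTHORIZATION with a transaction is a convenient way to test an unprivileged user's access and then return to the superuser context, because the LOCAL setting ends with the transaction. The following sketch is illustrative; the user and table names are assumptions:

```
-- As a superuser, run one transaction as an unprivileged user.
begin;
set local session authorization 'dwuser';
select current_user;              -- now reports dwuser
select * from restricted_table;   -- fails if dwuser lacks SELECT privilege
rollback;                         -- authorization reverts with the transaction
```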

# SET SESSION CHARACTERISTICS
<a name="r_SET_SESSION_CHARACTERISTICS"></a>

This command is deprecated.

# SHOW
<a name="r_SHOW"></a>

Displays the current value of a server configuration parameter. This value may be specific to the current session if a SET command is in effect. For a list of configuration parameters, see [Configuration reference](cm_chap_ConfigurationRef.md).

## Syntax
<a name="r_SHOW-synopsis"></a>

```
SHOW { parameter_name | ALL }
```

The following statement displays the current value of a session context variable. If the variable doesn't exist, Amazon Redshift throws an error.

```
SHOW variable_name
```

## Parameters
<a name="r_SHOW-parameters"></a>

 *parameter_name*   
Displays the current value of the specified parameter.

ALL   
Displays the current values of all of the parameters.

*variable_name*   
Displays the current value of the specified variable.

## Examples
<a name="r_SHOW-examples"></a>

The following example displays the value for the query_group parameter: 

```
show query_group;

query_group

unset
(1 row)
```

The following example displays a list of all parameters and their values: 

```
show all;
name        |   setting
--------------------+--------------
datestyle          | ISO, MDY
extra_float_digits | 0
query_group        | unset
search_path        | $user,public
statement_timeout  | 0
```

The following example displays the current value of the specified variable.

```
SHOW app_context.user_id;
```

# SHOW COLUMN GRANTS
<a name="r_SHOW_COLUMN_GRANTS"></a>

Displays grants on a column within a table.

## Required permissions
<a name="r_SHOW_COLUMN_GRANTS-required-permissions"></a>

SHOW GRANTS for a target object displays only the grants that are visible to the current user. A grant is visible to the current user if the current user satisfies one of the following criteria:
+ Be a superuser
+ Be the granted user
+ Be the owner of the granted role
+ Be granted the role targeted by the object grant

## Syntax
<a name="r_SHOW_COLUMN_GRANTS-synopsis"></a>

```
SHOW COLUMN GRANTS ON TABLE
{ database_name.schema_name.table_name | schema_name.table_name }
[FOR {username | ROLE role_name | PUBLIC}]
[LIMIT row_limit]
```

## Parameters
<a name="r_SHOW_COLUMN_GRANTS-parameters"></a>

database_name  
The name of the database containing the target table.

schema_name  
The name of the schema containing the target table.

table_name  
The name of the target table.

username  
Only include grants to *username* in the output.

role_name  
Only include grants to *role_name* in the output.

PUBLIC  
Only include grants to PUBLIC in the output.

row_limit  
The maximum number of rows to return. The *row_limit* can be 0–10,000.

## Examples
<a name="r_SHOW_COLUMN_GRANTS-examples"></a>

The following example shows column grants on table demo_db.demo_schema.t100:

```
SHOW COLUMN GRANTS ON TABLE demo_db.demo_schema.t100;
 database_name | schema_name | table_name | column_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | grantor_name 
---------------+-------------+------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+--------------
 demo_db       | demo_schema | t100       | b           | COLUMN      | UPDATE         |         134 | bob           | user          | f            | COLUMN          | dbadmin
 demo_db       | demo_schema | t100       | a           | COLUMN      | SELECT         |         130 | alice         | user          | f            | COLUMN          | dbadmin
 demo_db       | demo_schema | t100       | a           | COLUMN      | UPDATE         |         130 | alice         | user          | f            | COLUMN          | dbadmin
```

The following example shows column grants on table demo_schema.t100 for user bob:

```
SHOW COLUMN GRANTS ON TABLE demo_schema.t100 for bob;
 database_name | schema_name | table_name | column_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | grantor_name 
---------------+-------------+------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+--------------
 demo_db       | demo_schema | t100       | b           | COLUMN      | UPDATE         |         135 | bob           | user          | f            | COLUMN          | dbadmin
```

# SHOW COLUMNS
<a name="r_SHOW_COLUMNS"></a>

Shows a list of columns in a table, along with some column attributes.

Each output row consists of a comma-separated list of database name, schema name, table name, column name, ordinal position, column default, is nullable, data type, character maximum length, numeric precision, remarks, sort key type, sort key order, distribution key, encoding, and collation. For more information about these attributes, see [SVV_ALL_COLUMNS](r_SVV_ALL_COLUMNS.md).

If more than 10,000 columns would result from the SHOW COLUMNS command, then an error is returned.

## Required permissions
<a name="r_SHOW_COLUMNS-privileges"></a>

To view a column in an Amazon Redshift table, the current user must satisfy one of the following criteria:
+ Be a superuser.
+ Be the owner of the table.
+ Be granted USAGE privilege on the parent schema, and be granted SELECT privilege on the table or on the column.

## Syntax
<a name="r_SHOW_COLUMNS-synopsis"></a>

```
SHOW COLUMNS FROM TABLE database_name.schema_name.table_name [LIKE 'filter_pattern'] [LIMIT row_limit ]
```

## Parameters
<a name="r_SHOW_COLUMNS-parameters"></a>

 *database_name*   
The name of the database that contains the tables to list.   
To show tables in an AWS Glue Data Catalog, specify `awsdatacatalog` as the database name, and ensure the system configuration `data_catalog_auto_mount` is set to `true`. For more information, see [ALTER SYSTEM](r_ALTER_SYSTEM.md).

 *schema_name*   
The name of the schema that contains the tables to list.   
To show AWS Glue Data Catalog tables, provide the AWS Glue database name as the schema name.

 *table_name*   
The name of the table that contains the columns to list. 

 *filter_pattern*   
A valid UTF-8 character expression with a pattern to match table names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_COLUMNS.html)
If *filter_pattern* does not contain metacharacters, then the pattern represents only the string itself; in that case LIKE acts the same as the equals operator. 

 *row_limit*   
The maximum number of rows to return. The *row_limit* can be 0–10,000. 

## Examples
<a name="r_SHOW_COLUMNS-examples"></a>

The following example shows the columns in the table `compound_sort_table` in the schema `demo_schema` of the database `demo_db`.

```
SHOW COLUMNS FROM TABLE demo_schema.compound_sort_table;

  database_name | schema_name |     table_name      | column_name | ordinal_position | column_default | is_nullable |     data_type     | character_maximum_length | numeric_precision | numeric_scale | remarks | sort_key_type | sort_key | dist_key | encoding | collation 
---------------+-------------+---------------------+-------------+------------------+----------------+-------------+-------------------+--------------------------+-------------------+---------------+---------+---------------+----------+----------+----------+-----------
 demo_db       | demo_schema | compound_sort_table | id          |                1 |                | YES         | integer           |                          |                32 |             0 |         | COMPOUND      |        1 |        1 | delta32k | 
 demo_db       | demo_schema | compound_sort_table | name        |                2 |                | YES         | character varying |                       50 |                   |               |         | COMPOUND      |        2 |          | lzo      | default
 demo_db       | demo_schema | compound_sort_table | date_col    |                3 |                | YES         | date              |                          |                   |               |         |               |        0 |          | delta    | 
 demo_db       | demo_schema | compound_sort_table | amount      |                4 |                | YES         | numeric           |                          |                10 |             2 |         |               |        0 |          | mostly16 |
```

The following example shows the columns in the table `t22` in the schema `public` of the database `second_db`.

```
SHOW COLUMNS FROM TABLE second_db.public.t22;

 database_name | schema_name | table_name | column_name | ordinal_position | column_default | is_nullable |          data_type          | character_maximum_length | numeric_precision | numeric_scale | remarks | sort_key_type | sort_key | dist_key | encoding | collation 
---------------+-------------+------------+-------------+------------------+----------------+-------------+-----------------------------+--------------------------+-------------------+---------------+---------+---------------+----------+----------+----------+-----------
 second_db     | public      | t22        | col1        |                1 |                | YES         | integer                     |                          |                32 |             0 |         | INTERLEAVED   |       -1 |          | mostly8  | 
 second_db     | public      | t22        | col2        |                2 |                | YES         | character varying           |                      100 |                   |               |         | INTERLEAVED   |        2 |          | text255  | default
 second_db     | public      | t22        | col3        |                3 |                | YES         | timestamp without time zone |                          |                   |               |         |               |        0 |          | raw      | 
 second_db     | public      | t22        | col4        |                4 |                | YES         | numeric                     |                          |                10 |             2 |         |               |        0 |          | az64     |
```
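The LIKE and LIMIT clauses from the syntax above can be combined to narrow the output. For example, the following illustrative statement lists only columns of the earlier table whose names start with `c`, returning at most five rows:

```
SHOW COLUMNS FROM TABLE demo_db.demo_schema.compound_sort_table LIKE 'c%' LIMIT 5;
```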

# SHOW CONSTRAINTS
<a name="r_SHOW_CONSTRAINTS"></a>

Shows a list of primary key and foreign key constraints in a table.

## Required permissions
<a name="r_SHOW_CONSTRAINTS-required-permissions"></a>

To execute SHOW CONSTRAINTS on a table, the current user must satisfy one of the following criteria:
+ Be a superuser
+ Be the owner of the table
+ Be granted USAGE privilege on the parent schema and SELECT privilege on the table

## Syntax
<a name="r_SHOW_CONSTRAINTS-synopsis"></a>

```
SHOW CONSTRAINTS {PRIMARY KEYS | FOREIGN KEYS [EXPORTED]}
FROM TABLE
{ database_name.schema_name.table_name | schema_name.table_name }
[LIMIT row_limit]
```

## Parameters
<a name="r_SHOW_CONSTRAINTS-parameters"></a>

*database_name*  
The name of the database containing the target table.

*schema_name*  
The name of the schema containing the target table.

*table_name*  
The name of the target table.

EXPORTED  
When EXPORTED is specified, lists all foreign keys from other tables that reference the target table.

*row_limit*  
The maximum number of rows to return. The *row_limit* can be 0–10,000.

## Examples
<a name="r_SHOW_CONSTRAINTS-examples"></a>

The following example shows primary key constraints from table demo_db.demo_schema.pk1:

```
SHOW CONSTRAINTS PRIMARY KEYS FROM TABLE demo_db.demo_schema.pk1;
 database_name | schema_name | table_name | pk_name  | column_name | key_seq 
---------------+-------------+------------+----------+-------------+---------
 demo_db       | demo_schema | pk1        | pk1_pkey | i           |       1
 demo_db       | demo_schema | pk1        | pk1_pkey | j           |       2
 demo_db       | demo_schema | pk1        | pk1_pkey | c           |       3
```

The following example shows foreign key constraints from table demo_schema.fk2:

```
SHOW CONSTRAINTS FOREIGN KEYS FROM TABLE demo_schema.fk2;
 pk_database_name | pk_schema_name | pk_table_name | pk_column_name | fk_database_name | fk_schema_name | fk_table_name | fk_column_name | key_seq |  fk_name   | pk_name  | update_rule | delete_rule | deferrability 
------------------+----------------+---------------+----------------+------------------+----------------+---------------+----------------+---------+------------+----------+-------------+-------------+---------------
 demo_db          | demo_schema    | pk1           | i              | demo_db          | demo_schema    | fk2           | i              |       1 | fk2_i_fkey | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | j              | demo_db          | demo_schema    | fk2           | j              |       2 | fk2_i_fkey | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | c              | demo_db          | demo_schema    | fk2           | c              |       3 | fk2_i_fkey | pk1_pkey |             |             |
```

The following example shows exported foreign key constraints from table demo_schema.pk1:

```
SHOW CONSTRAINTS FOREIGN KEYS EXPORTED FROM TABLE demo_schema.pk1;
 pk_database_name | pk_schema_name | pk_table_name | pk_column_name | fk_database_name | fk_schema_name | fk_table_name | fk_column_name | key_seq |     fk_name     | pk_name  | update_rule | delete_rule | deferrability 
------------------+----------------+---------------+----------------+------------------+----------------+---------------+----------------+---------+-----------------+----------+-------------+-------------+---------------
 demo_db          | demo_schema    | pk1           | i              | demo_db          | demo_schema    | fk2           | i              |       1 | fk2_i_fkey      | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | j              | demo_db          | demo_schema    | fk2           | j              |       2 | fk2_i_fkey      | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | c              | demo_db          | demo_schema    | fk2           | c              |       3 | fk2_i_fkey      | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | i              | demo_db          | demo_schema    | other_fk      | i              |       1 | other_fk_i_fkey | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | j              | demo_db          | demo_schema    | other_fk      | j              |       2 | other_fk_i_fkey | pk1_pkey |             |             | 
 demo_db          | demo_schema    | pk1           | c              | demo_db          | demo_schema    | other_fk      | c              |       3 | other_fk_i_fkey | pk1_pkey |             |             |
```

# SHOW EXTERNAL TABLE
<a name="r_SHOW_EXTERNAL_TABLE"></a>

Shows the definition of an external table, including table attributes and column attributes. You can use the output of the SHOW EXTERNAL TABLE statement to recreate the table. 

For more information about external table creation, see [CREATE EXTERNAL TABLE](r_CREATE_EXTERNAL_TABLE.md). 

## Syntax
<a name="r_SHOW_EXTERNAL_TABLE-synopsis"></a>

```
SHOW EXTERNAL TABLE [external_database].external_schema.table_name [ PARTITION ]
```

## Parameters
<a name="r_SHOW_EXTERNAL_TABLE-parameters"></a>

 *external_database*   
The name of the associated external database. This parameter is optional.

 *external_schema*   
The name of the associated external schema. 

 *table_name*   
The name of the table to show. 

PARTITION   
Displays ALTER TABLE statements to add partitions to the table definition. 

## Examples
<a name="r_SHOW_EXTERNAL_TABLE-examples"></a>

The following examples are based on an external table defined as follows:

```
CREATE EXTERNAL TABLE my_schema.alldatatypes_parquet_test_partitioned (
     csmallint smallint,
     cint int,
     cbigint bigint,
     cfloat float4,
     cdouble float8,
     cchar char(10),
     cvarchar varchar(255),
     cdecimal_small decimal(18,9),
     cdecimal_big decimal(30,15),
     ctimestamp TIMESTAMP,
     cboolean boolean,
     cstring varchar(16383)
)
PARTITIONED BY (cdate date, ctime TIMESTAMP)
STORED AS PARQUET
LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned';
```

Following is an example of the SHOW EXTERNAL TABLE command and output for the table `my_schema.alldatatypes_parquet_test_partitioned`.

```
SHOW EXTERNAL TABLE my_schema.alldatatypes_parquet_test_partitioned;
```

```
"CREATE EXTERNAL TABLE my_schema.alldatatypes_parquet_test_partitioned (
    csmallint smallint,
    cint int,
    cbigint bigint,
    cfloat float4,
    cdouble float8,
    cchar char(10),
    cvarchar varchar(255),
    cdecimal_small decimal(18,9),
    cdecimal_big decimal(30,15),
    ctimestamp timestamp,
    cboolean boolean,
    cstring varchar(16383)
)
PARTITIONED BY (cdate date, ctime timestamp)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned';"
```

Following is an example of the SHOW EXTERNAL TABLE command and output for the same table, but with the database also specified in the parameter.

```
SHOW EXTERNAL TABLE my_database.my_schema.alldatatypes_parquet_test_partitioned;
```

```
"CREATE EXTERNAL TABLE my_database.my_schema.alldatatypes_parquet_test_partitioned (
    csmallint smallint,
    cint int,
    cbigint bigint,
    cfloat float4,
    cdouble float8,
    cchar char(10),
    cvarchar varchar(255),
    cdecimal_small decimal(18,9),
    cdecimal_big decimal(30,15),
    ctimestamp timestamp,
    cboolean boolean,
    cstring varchar(16383)
)
PARTITIONED BY (cdate date, ctime timestamp)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned';"
```

Following is an example of the SHOW EXTERNAL TABLE command and output when using the `PARTITION` parameter. The output contains ALTER TABLE statements to add partitions to the table definition.

```
SHOW EXTERNAL TABLE my_schema.alldatatypes_parquet_test_partitioned PARTITION;
```

```
"CREATE EXTERNAL TABLE my_schema.alldatatypes_parquet_test_partitioned (
    csmallint smallint,
    cint int,
    cbigint bigint,
    cfloat float4,
    cdouble float8,
    cchar char(10),
    cvarchar varchar(255),
    cdecimal_small decimal(18,9),
    cdecimal_big decimal(30,15),
    ctimestamp timestamp,
    cboolean boolean,
    cstring varchar(16383)
)
PARTITIONED BY (cdate date)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned';
ALTER TABLE my_schema.alldatatypes_parquet_test_partitioned ADD IF NOT EXISTS PARTITION (cdate='2021-01-01') LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned2/cdate=2021-01-01';
ALTER TABLE my_schema.alldatatypes_parquet_test_partitioned ADD IF NOT EXISTS PARTITION (cdate='2021-01-02') LOCATION 's3://amzn-s3-demo-bucket/alldatatypes_parquet_partitioned2/cdate=2021-01-02';"
```

# SHOW DATABASES
<a name="r_SHOW_DATABASES"></a>

Displays databases from a Data Catalog or an Amazon Redshift data warehouse. SHOW DATABASES lists all accessible databases, including databases within the data warehouse, AWS Glue Data Catalog databases (`awsdatacatalog`), datashare databases, and Lake Formation databases.

## Required permissions
<a name="r_SHOW_DATABASES-privileges"></a>

All databases are visible to users, with the following exception:
+ For a database created from a datashare to be visible, the current user must be granted USAGE permission on that database.

## Syntax
<a name="r_SHOW_DATABASES-syntax"></a>

To show databases from an Amazon Redshift data warehouse:

```
SHOW DATABASES 
[ LIKE '<expression>' ]
[ LIMIT row_limit ]
```

To show databases from a Data Catalog:

```
SHOW DATABASES FROM DATA CATALOG 
[ ACCOUNT  '<id1>', '<id2>', ... ]
[ LIKE '<expression>' ]
[ IAM_ROLE default | 'SESSION' | 'arn:aws:iam::<account-id>:role/<role-name>' ]
[ LIMIT row_limit ]
```

## Parameters
<a name="r_SHOW_DATABASES-parameters"></a>

ACCOUNT '<id1>', '<id2>', ...   
The AWS Glue Data Catalog accounts from which to list databases. Omitting this parameter indicates that Amazon Redshift should show the databases from the account that owns the cluster.

LIKE '<expression>'  
Filters the list of databases to those that match the expression that you specify. This parameter supports patterns that use the wildcard characters % (percent) and _ (underscore).

IAM_ROLE default | 'SESSION' | 'arn:aws:iam::<account-id>:role/<role-name>'  
If you specify an IAM role that is associated with the cluster when running the SHOW DATABASES command, Amazon Redshift uses the role's credentials when you run queries on the database.  
Specifying the `default` keyword means to use the IAM role that's set as the default and associated with the cluster.  
Use `'SESSION'` if you connect to your Amazon Redshift cluster using a federated identity and access the tables from the external database created using the [CREATE DATABASE](r_CREATE_DATABASE.md) command. For an example of using a federated identity, see [Using a federated identity to manage Amazon Redshift access to local resources and Amazon Redshift Spectrum external tables](https://docs.aws.amazon.com/redshift/latest/mgmt/authorization-fas-spectrum.html), which explains how to configure federated identity.   
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. At a minimum, the IAM role must have permission to perform a LIST operation on the Amazon S3 bucket to be accessed and a GET operation on the Amazon S3 objects the bucket contains. To learn more about databases created from the AWS Glue Data Catalog for datashares and using IAM_ROLE, see [Working with Lake Formation-managed datashares as a consumer](https://docs.aws.amazon.com/redshift/latest/dg/lake-formation-getting-started-consumer.html).  
The following shows the syntax for the IAM_ROLE parameter string for a single ARN.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-name>'
```
You can chain roles so that your cluster can assume another IAM role, possibly belonging to another account. You can chain up to 10 roles. For more information, see [Chaining IAM roles in Amazon Redshift Spectrum](c-spectrum-iam-policies.md#c-spectrum-chaining-roles).   
 To this IAM role, attach an IAM permissions policy similar to the following.    

```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "AccessSecret",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:us-west-2:123456789012:secret:my-rds-secret-VNenFy"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetRandomPassword",
                "secretsmanager:ListSecrets"
            ],
            "Resource": "*"
        }
    ]
}
```
For the steps to create an IAM role to use with federated query, see [Creating a secret and an IAM role to use federated queries](federated-create-secret-iam-role.md).   
Don't include spaces in the list of chained roles.
The following shows the syntax for chaining three roles.  

```
IAM_ROLE 'arn:aws:iam::<aws-account-id>:role/<role-1-name>,arn:aws:iam::<aws-account-id>:role/<role-2-name>,arn:aws:iam::<aws-account-id>:role/<role-3-name>'
```

LIMIT *row_limit*  
Clause that limits the number of rows returned, where *row_limit* is the maximum number of rows to return. The *row_limit* can be 0–10,000.

## Examples
<a name="r_SHOW_DATABASES-examples"></a>

The following example displays all of the Data Catalog databases from the account ID 123456789012.

```
SHOW DATABASES FROM DATA CATALOG ACCOUNT '123456789012'

  catalog_id  | database_name |                        database_arn                    |     type     |                                             target_database                                      | location | parameters
--------------+---------------+--------------------------------------------------------+--------------+--------------------------------------------------------------------------------------------------+----------+------------
 123456789012 |   database1   | arn:aws:glue:us-east-1:123456789012:database/database1 | Data Catalog |                                                                                                  |          |
 123456789012 |   database2   | arn:aws:glue:us-east-1:123456789012:database/database2 | Data Catalog | arn:aws:redshift:us-east-1:123456789012:datashare:035c45ea-61ce-86f0-8b75-19ac6102c3b7/database2 |          |
```

The following are examples that demonstrate how to display all of the Data Catalog databases from the account ID 123456789012 while using an IAM role's credentials.

```
SHOW DATABASES FROM DATA CATALOG ACCOUNT '123456789012' IAM_ROLE default;
```

```
SHOW DATABASES FROM DATA CATALOG ACCOUNT '123456789012' IAM_ROLE '<iam-role-arn>';
```

The following example displays all of the databases in the connected Amazon Redshift data warehouse.

```
SHOW DATABASES

database_name  | database_owner | database_type        | database_acl | parameters | database_isolation_level
---------------+----------------+----------------------+--------------+------------+--------------------
awsdatacatalog | 1              | auto mounted catalog | NULL         | UNKNOWN    | UNKNOWN
dev            | 1              | local                | NULL         | NULL       | Snapshot Isolation
```
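The LIKE clause applies to either form of the command. For example, the following illustrative statement restricts the local listing to database names that start with `dev`:

```
SHOW DATABASES LIKE 'dev%';
```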

# SHOW FUNCTIONS
<a name="r_SHOW_FUNCTIONS"></a>

Shows a list of functions in a schema, along with information about the listed objects.

Each output row has the columns database_name, schema_name, function_name, number_of_arguments, argument_list, return_type, and remarks.

If more than 10,000 rows would result from SHOW FUNCTIONS, then the command raises an error.

## Required permissions
<a name="r_SHOW_FUNCTIONS-required-permissions"></a>

To view a function in a Redshift schema, the current user must satisfy one of the following criteria:
+ Be a superuser
+ Be the owner of the function
+ Be granted USAGE privilege on the parent schema and EXECUTE privilege on the function

## Syntax
<a name="r_SHOW_FUNCTIONS-synopsis"></a>

```
SHOW FUNCTIONS FROM SCHEMA
[database_name.]schema_name
[LIKE 'filter_pattern'] [LIMIT row_limit]
```

## Parameters
<a name="r_SHOW_FUNCTIONS-parameters"></a>

*database_name*  
The name of the database that contains the functions to list.

*schema_name*  
The name of the schema that contains the functions to list.

*filter_pattern*  
A valid UTF-8 character expression with a pattern to match function names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_FUNCTIONS.html)
Note that the *filter_pattern* matches only the function name.

*row_limit*  
The maximum number of rows to return. The *row_limit* can be 0–10,000.

## Examples
<a name="r_SHOW_FUNCTIONS-examples"></a>

The following example shows functions from schema demo_db.demo_schema:

```
SHOW FUNCTIONS FROM SCHEMA demo_db.demo_schema;
 database_name | schema_name |    function_name     | number_of_arguments |                                  argument_list                                  |    return_type    | remarks 
---------------+-------------+----------------------+---------------------+---------------------------------------------------------------------------------+-------------------+---------
 demo_db       | demo_schema | f2                   |                   6 | integer, character varying, numeric, date, timestamp without time zone, boolean | character varying | 
 demo_db       | demo_schema | f_calculate_discount |                   2 | numeric, integer                                                                | numeric           | 
 demo_db       | demo_schema | f_days_between       |                   2 | date, date                                                                      | integer           |
```

The following example shows functions from schema demo_schema with names ending in 'discount':

```
SHOW FUNCTIONS FROM SCHEMA demo_schema like '%discount';
 database_name | schema_name |    function_name     | number_of_arguments |  argument_list   | return_type | remarks 
---------------+-------------+----------------------+---------------------+------------------+-------------+---------
 demo_db       | demo_schema | f_calculate_discount |                   2 | numeric, integer | numeric     |
```

# SHOW GRANTS
<a name="r_SHOW_GRANTS"></a>

Displays grants for a user, role, or object. The object can be a database, a schema, a table, a function, or a template. When you specify an object, such as a table or function, you need to qualify it with two-part or three-part notation. For example, `schema_name.table_name` or `database_name.schema_name.table_name`.
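The two-part and three-part notations are simple dot-separated identifiers. A hedged sketch of splitting such a name into its components (`parse_qualified_name` is a hypothetical helper for illustration, not a Redshift API, and it ignores quoted identifiers that contain dots):

```python
def parse_qualified_name(name: str) -> dict:
    """Split a two- or three-part object name into its components.

    'schema.table'          -> {'schema': ..., 'object': ...}
    'database.schema.table' -> adds a 'database' key.
    """
    parts = name.split(".")
    if len(parts) == 2:
        return {"schema": parts[0], "object": parts[1]}
    if len(parts) == 3:
        return {"database": parts[0], "schema": parts[1], "object": parts[2]}
    raise ValueError("expected two-part or three-part notation")

print(parse_qualified_name("demo_db.demo_schema.t3"))
# {'database': 'demo_db', 'schema': 'demo_schema', 'object': 't3'}
```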

If SHOW GRANTS would return more than 10,000 rows, the command raises an error.

## Required permissions
<a name="r_SHOW_GRANTS-permissions"></a>

To run SHOW GRANTS for a target user or role, the current user must satisfy one of the following criteria:
+ Be a superuser
+ Be the target user
+ Be the owner of the target role
+ Be granted the role

SHOW GRANTS for a target object displays only the grants that are visible to the current user. A grant is visible to the current user if the current user satisfies one of the following criteria:
+ Be a superuser
+ Be the grantee of the grant
+ Be the owner of the granted role
+ Be granted the role targeted by the object grant

## Syntax
<a name="r_SHOW_GRANTS-syntax"></a>

The following is the syntax for showing grants on an object. Note that the second way of specifying a function is only valid for external schemas and databases created from a datashare.

```
SHOW GRANTS ON
{
 DATABASE database_name |
 FUNCTION {database_name.schema_name.function_name | schema_name.function_name } ( [ [ argname ] argtype [, ...] ] ) |
 FUNCTION {database_name.schema_name.function_name | schema_name.function_name } |
 SCHEMA {database_name.schema_name | schema_name} | 
 { TABLE {database_name.schema_name.table_name | schema_name.table_name} | table_name } |
 TEMPLATE {database_name.schema_name.template_name | template_name}
}
[FOR {username | ROLE role_name | PUBLIC}]
[LIMIT row_limit]
```

The following is the syntax for showing grants for a user or role. 

```
SHOW GRANTS FOR
{username | ROLE role_name}
[FROM DATABASE database_name]
[LIMIT row_limit]
```

## Parameters
<a name="r_SHOW_GRANTS-parameters"></a>

 *database\_name*   
The name of the database to show grants on.

 *function\_name*   
The name of the function to show grants on.

 *template\_name*   
The name of the template to show grants on.

 *schema\_name*   
The name of the schema to show grants on.

 *table\_name*   
The name of the table to show grants on.

FOR *username*   
Indicates showing grants for a user.

FOR ROLE *role\_name*   
Indicates showing grants for a role.

FOR PUBLIC  
Indicates showing grants for PUBLIC.

 *row\_limit*   
The maximum number of rows to return. The *row\_limit* can be 0–10,000. 

## Examples
<a name="r_SHOW_GRANTS-examples"></a>

The following example displays all grants on a database named `demo_db`.

```
SHOW GRANTS on database demo_db;

  database_name | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | grantor_name 
---------------+----------------+-------------+---------------+---------------+--------------+-----------------+--------------
 demo_db       | ALTER          |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | TRUNCATE       |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | DROP           |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | INSERT         |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | TEMP           |           0 | public        | public        | f            | DATABASE        | dbadmin
 demo_db       | SELECT         |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | UPDATE         |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | DELETE         |         112 | alice         | user          | f            | TABLES          | dbadmin
 demo_db       | REFERENCES     |         112 | alice         | user          | f            | TABLES          | dbadmin
```

The following command shows all grants on a schema named `demo`.

```
SHOW GRANTS ON SCHEMA demo_schema;

 schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | database_name | grantor_name 
-------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+---------------+--------------
 demo_schema | demo_schema | SCHEMA      | ALTER          |         112 | alice         | user          | f            | SCHEMA          | db1           | dbadmin
 demo_schema | demo_schema | SCHEMA      | DROP           |         112 | alice         | user          | f            | SCHEMA          | db1           | dbadmin
 demo_schema | demo_schema | SCHEMA      | USAGE          |         112 | alice         | user          | f            | SCHEMA          | db1           | dbadmin
 demo_schema | demo_schema | SCHEMA      | CREATE         |         112 | alice         | user          | f            | SCHEMA          | db1           | dbadmin
```

The following command shows all grants for a user named `alice`.

```
SHOW GRANTS FOR alice;

 database_name | schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | privilege_scope | grantor_name 
---------------+-------------+-------------+-------------+----------------+-------------+---------------+---------------+-----------------+--------------
 demo_db       |             |             | DATABASE    | INSERT         |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | SELECT         |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | UPDATE         |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | DELETE         |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | REFERENCES     |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | DROP           |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | TRUNCATE       |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       |             |             | DATABASE    | ALTER          |         124 | alice         | user          | TABLES          | dbadmin
 demo_db       | demo_schema |             | SCHEMA      | USAGE          |         124 | alice         | user          | SCHEMA          | dbadmin
 demo_db       | demo_schema |             | SCHEMA      | CREATE         |         124 | alice         | user          | SCHEMA          | dbadmin
 demo_db       | demo_schema |             | SCHEMA      | DROP           |         124 | alice         | user          | SCHEMA          | dbadmin
 demo_db       | demo_schema |             | SCHEMA      | ALTER          |         124 | alice         | user          | SCHEMA          | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | INSERT         |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | SELECT         |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | UPDATE         |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | DELETE         |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | RULE           |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | REFERENCES     |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | TRIGGER        |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | DROP           |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | TRUNCATE       |         124 | alice         | user          | TABLE           | dbadmin
 demo_db       | demo_schema | t1          | TABLE       | ALTER          |         124 | alice         | user          | TABLE           | dbadmin
```

The following example shows grants for `alice` from the database `second_db`.

```
SHOW GRANTS FOR alice FROM DATABASE second_db;
 database_name | schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | privilege_scope | grantor_name 
---------------+-------------+-------------+-------------+----------------+-------------+---------------+---------------+-----------------+--------------
 second_db     | public      | t22         | TABLE       | SELECT         |         101 | alice         | user          | TABLE           | dbadmin
```

The following command shows all grants on a table named `t3` for a user named `alice`. Note that you can use either two-part or three-part notation to specify the table name.

```
SHOW GRANTS ON TABLE demo_db.demo_schema.t3 FOR ALICE;
 schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | database_name | grantor_name 
-------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+---------------+--------------
 demo_schema | t3          | TABLE       | ALTER          |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | TRUNCATE       |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | DROP           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | TRIGGER        |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | SELECT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | INSERT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | UPDATE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | DELETE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | RULE           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | REFERENCES     |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin


SHOW GRANTS ON TABLE demo_schema.t3 FOR ALICE;
 schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | database_name | grantor_name 
-------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+---------------+--------------
 demo_schema | t3          | TABLE       | ALTER          |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | TRUNCATE       |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | DROP           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | TRIGGER        |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | SELECT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | INSERT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | UPDATE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | DELETE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | RULE           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 demo_schema | t3          | TABLE       | REFERENCES     |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
```

The following example shows all grants on a table named `t4`. Note the different ways that you can specify the table name.

```
SHOW GRANTS ON t4;
 schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | database_name | grantor_name 
-------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+---------------+--------------
 public      | t4          | TABLE       | ALTER          |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | TRUNCATE       |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | DROP           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | TRIGGER        |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | SELECT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | INSERT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | UPDATE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | DELETE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | RULE           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | REFERENCES     |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 
SHOW GRANTS ON TABLE public.t4;
 schema_name | object_name | object_type | privilege_type | identity_id | identity_name | identity_type | admin_option | privilege_scope | database_name | grantor_name 
-------------+-------------+-------------+----------------+-------------+---------------+---------------+--------------+-----------------+---------------+--------------
 public      | t4          | TABLE       | ALTER          |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | TRUNCATE       |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | DROP           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | TRIGGER        |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | SELECT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | INSERT         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | UPDATE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | DELETE         |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | RULE           |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
 public      | t4          | TABLE       | REFERENCES     |         130 | alice         | user          | f            | TABLE           | demo_db       | dbadmin
```

# SHOW MODEL
<a name="r_SHOW_MODEL"></a>

Shows useful information about a machine learning model, including its status, the parameters used to create it, and the prediction function with its input argument types. You can use the information from SHOW MODEL to recreate the model. If the base tables have changed, running CREATE MODEL with the same SQL statement results in a different model. The information returned by SHOW MODEL differs for the model owner and for a user with the EXECUTE privilege. SHOW MODEL also shows different output depending on whether the model was trained in Amazon Redshift or is a bring your own model (BYOM).

## Syntax
<a name="r_SHOW_MODEL-synopsis"></a>

```
SHOW MODEL ( ALL | model_name )
```

## Parameters
<a name="r_SHOW_MODEL-parameters"></a>

ALL   
Returns all the models that the user can use and their schemas.

 *model\_name*   
The name of the model. The model name in a schema must be unique.

## Usage notes
<a name="r_SHOW_MODEL_usage_notes"></a>

The SHOW MODEL command returns the following: 
+ The model name.
+ The schema where the model was created.
+ The owner of the model.
+ The model creation time.
+ The status of the model, such as READY, TRAINING, or FAILED.
+ The reason message for a failed model.
+ The validation error, if the model has finished training.
+ The estimated cost needed to derive the model for a non-BYOM approach. Only the owner of the model can view this information.
+ A list of user-specified parameters and their values, specifically the following:
  + The specified TARGET column.
  + The model type, AUTO or XGBoost.
  + The problem type, such as REGRESSION, BINARY\_CLASSIFICATION, or MULTICLASS\_CLASSIFICATION. This parameter is specific to AUTO.
  + The name of the Amazon SageMaker AI training job or the Amazon SageMaker AI Autopilot job that created the model. You can use this job name to find more information about the model on Amazon SageMaker AI.
  + The objective, such as MSE, F1, or Accuracy. This parameter is specific to AUTO.
  + The name of the created function.
  + The type of inference, local or remote.
  + The prediction function input arguments.
  + The prediction function input argument types for models that aren't bring your own model (BYOM).
  + The return type of the prediction function. This parameter is specific to BYOM.
  + The name of the Amazon SageMaker AI endpoint for a BYOM model with remote inference.
  + The IAM role. Only the owner of the model can see this.
  + The S3 bucket used. Only the owner of the model can see this.
  + The AWS KMS key, if one was provided. Only the owner of the model can see this.
  + The maximum time that the model can run.
+ If the model type is not AUTO, then Amazon Redshift also shows the list of hyperparameters provided and their values.

You can also view some of the information provided by SHOW MODEL in other catalog tables, such as pg\_proc. Amazon Redshift returns information about the prediction function that is registered in the pg\_proc catalog table, including the input argument names and their types. Amazon Redshift returns the same information in the SHOW MODEL command.

```
SELECT * FROM pg_proc WHERE proname ILIKE '%<function_name>%';
```

## Examples
<a name="r_SHOW_MODEL-examples"></a>

The following example shows the output of SHOW MODEL ALL.

```
SHOW MODEL ALL;

Schema Name |  Model Name
------------+---------------
 public     | customer_churn
```

The owner of the customer\_churn model can see the following output. A user with only the EXECUTE privilege can't see the IAM role, the Amazon S3 bucket, or the estimated cost of the model.

```
SHOW MODEL customer_churn;

       Key                 |           Value
---------------------------+-----------------------------------
 Model Name                | customer_churn
 Schema Name               | public
 Owner                     | 'owner'
 Creation Time             | Sat, 15.01.2000 14:45:20
 Model State               | READY
 validation:F1             | 0.855
 Estimated Cost            | 5.7
                           |
 TRAINING DATA:            |
 Table                     | customer_data
 Target Column             | CHURN
                           |
 PARAMETERS:               |
 Model Type                | auto
 Problem Type              | binary_classification
 Objective                 | f1
 Function Name             | predict_churn
 Function Parameters       | age zip average_daily_spend average_daily_cases
 Function Parameter Types  | int int float float
 IAM Role                  | 'iam_role'
 KMS Key                   | 'kms_key'
 Max Runtime               | 36000
```

# SHOW DATASHARES
<a name="r_SHOW_DATASHARES"></a>

Displays the inbound and outbound shares in a cluster either from the same account or across accounts. If you don't specify a datashare name, then Amazon Redshift displays all datashares in all databases in the cluster. Users who have the ALTER and SHARE privileges can see the shares that they have privileges for. 

## Syntax
<a name="r_SHOW_DATASHARES-synopsis"></a>

```
SHOW DATASHARES [ LIKE 'namepattern' ] 
```

## Parameters
<a name="r_SHOW_DATASHARES-parameters"></a>

LIKE  
An optional clause that compares the specified name pattern to the description of the datashare. When this clause is used, Amazon Redshift displays only the datashares with names that match the specified name pattern.

*namepattern*  
The name of the datashare requested or part of the name to be matched using wildcard characters.

## Examples
<a name="r_SHOW_DATASHARES-examples"></a>

The following example displays the inbound and outbound shares in a cluster, with a LIKE filter that matches datashare names. 

```
SHOW DATASHARES;
SHOW DATASHARES LIKE 'sales%';

share_name   | share_owner | source_database | consumer_database | share_type | createdate          | is_publicaccessible | share_acl | producer_account |           producer_namespace
-------------+-------------+-----------------+-------------------+------------+---------------------+---------------------+-----------+------------------+---------------------------------------
'salesshare' | 100         | dev             |                   | outbound   | 2020-12-09 01:22:54.| False               |           |   123456789012   | 13b8833d-17c6-4f16-8fe4-1a018f5ed00d
```

# SHOW PARAMETERS
<a name="r_SHOW_PARAMETERS"></a>

Shows the list of parameters for a function or procedure, along with some information about each parameter.

Each output row has the columns database\_name, schema\_name, procedure\_name or function\_name, parameter\_name, ordinal\_position, parameter\_type (IN/OUT), data\_type, character\_maximum\_length, numeric\_precision, numeric\_scale, and remarks.

## Required permissions
<a name="r_SHOW_PARAMETERS-required-permissions"></a>

To view a function or procedure in a Redshift schema, the current user must satisfy one of the following criteria:
+ Be a superuser
+ Be the owner of the function
+ Be granted the USAGE privilege on the parent schema and the EXECUTE privilege on the function

## Syntax
<a name="r_SHOW_PARAMETERS-synopsis"></a>

```
SHOW PARAMETERS OF {FUNCTION| PROCEDURE}
[database_name.]schema_name.function_name(argtype [, ...] )
[LIKE 'filter_pattern'];
```

## Parameters
<a name="r_SHOW_PARAMETERS-parameters"></a>

*database\_name*  
The name of the database that contains the function to list.

*schema\_name*  
The name of the schema that contains the function to list.

*filter\_pattern*  
A valid UTF-8 character expression with a pattern to match parameter names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_PARAMETERS.html)

## Examples
<a name="r_SHOW_PARAMETERS-examples"></a>

The following example shows the parameters of procedure demo\_db.demo\_schema.f1:

```
SHOW PARAMETERS OF PROCEDURE demo_db.demo_schema.f1(VARCHAR, DECIMAL, DECIMAL, DECIMAL);
 database_name | schema_name | procedure_name |  parameter_name  | ordinal_position | parameter_type |          data_type          | character_maximum_length | numeric_precision | numeric_scale 
---------------+-------------+----------------+------------------+------------------+----------------+-----------------------------+--------------------------+-------------------+---------------
 demo_db       | demo_schema | f1             | operation        |                1 | IN             | character varying           |                       10 |                   |              
 demo_db       | demo_schema | f1             | value1           |                2 | IN             | numeric                     |                          |                18 |             0
 demo_db       | demo_schema | f1             | value2           |                3 | IN             | numeric                     |                          |                18 |             0
 demo_db       | demo_schema | f1             | result           |                4 | INOUT          | numeric                     |                          |                18 |             0
 demo_db       | demo_schema | f1             | operation_status |                5 | OUT            | character varying           |                       50 |                   |              
 demo_db       | demo_schema | f1             | calculation_time |                6 | OUT            | timestamp without time zone |                          |                   |              
 demo_db       | demo_schema | f1             | is_successful    |                7 | OUT            | boolean                     |                          |                   |
```

The following example shows the parameters of procedure demo\_schema.f1 with names starting with 'val':

```
SHOW PARAMETERS OF PROCEDURE demo_schema.f1(VARCHAR, DECIMAL, DECIMAL, DECIMAL) like 'val%';
 database_name | schema_name | procedure_name | parameter_name | ordinal_position | parameter_type | data_type | character_maximum_length | numeric_precision | numeric_scale 
---------------+-------------+----------------+----------------+------------------+----------------+-----------+--------------------------+-------------------+---------------
 demo_db       | demo_schema | f1             | value1         |                2 | IN             | numeric   |                          |                18 |             0
 demo_db       | demo_schema | f1             | value2         |                3 | IN             | numeric   |                          |                18 |             0
```

The following example shows the parameters of function demo\_schema.f2:

```
SHOW PARAMETERS OF FUNCTION demo_schema.f2(INT, VARCHAR, DECIMAL, DATE, TIMESTAMP, BOOLEAN);
 database_name | schema_name | function_name | parameter_name  | ordinal_position | parameter_type |          data_type          | character_maximum_length | numeric_precision | numeric_scale 
---------------+-------------+---------------+-----------------+------------------+----------------+-----------------------------+--------------------------+-------------------+---------------
 demo_db       | demo_schema | f2            |                 |                0 | RETURN         | character varying           |                       -1 |                   |              
 demo_db       | demo_schema | f2            | int_param       |                1 | IN             | integer                     |                          |                32 |             0
 demo_db       | demo_schema | f2            | varchar_param   |                2 | IN             | character varying           |                       -1 |                   |              
 demo_db       | demo_schema | f2            | decimal_param   |                3 | IN             | numeric                     |                          |                   |              
 demo_db       | demo_schema | f2            | date_param      |                4 | IN             | date                        |                          |                   |              
 demo_db       | demo_schema | f2            | timestamp_param |                5 | IN             | timestamp without time zone |                          |                   |              
 demo_db       | demo_schema | f2            | boolean_param   |                6 | IN             | boolean                     |                          |                   |
```
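Because the rows carry ordinal\_position and parameter\_type, a call signature can be reconstructed from them. A hedged sketch, assuming rows have been fetched into dicts keyed by the column names above; `rebuild_signature` is an illustrative helper, not a Redshift feature:

```python
def rebuild_signature(name, rows):
    """Rebuild a call signature from SHOW PARAMETERS rows, ordering IN/INOUT
    parameters by ordinal_position and skipping RETURN/OUT rows."""
    args = sorted(
        (r for r in rows if r["parameter_type"] in ("IN", "INOUT")),
        key=lambda r: r["ordinal_position"],
    )
    arg_list = ", ".join(f"{r['parameter_name']} {r['data_type']}" for r in args)
    return f"{name}({arg_list})"

rows = [
    {"parameter_name": "value1", "ordinal_position": 2,
     "parameter_type": "IN", "data_type": "numeric"},
    {"parameter_name": "operation", "ordinal_position": 1,
     "parameter_type": "IN", "data_type": "character varying"},
    {"parameter_name": "operation_status", "ordinal_position": 5,
     "parameter_type": "OUT", "data_type": "character varying"},
]
print(rebuild_signature("f1", rows))
# f1(operation character varying, value1 numeric)
```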

# SHOW POLICIES
<a name="r_SHOW_POLICIES"></a>

Displays the row-level security (RLS) and dynamic data masking (DDM) policies defined in a database, as well as the RLS and DDM policies applied to specific relations. Only a superuser or a user with the `sys:secadmin` role on the database can view the results of these policies.

## Syntax
<a name="r_SHOW_POLICIES-synopsis"></a>

```
SHOW { RLS | MASKING } POLICIES
[
    ON { database_name.schema_name.relation_name
       | schema_name.relation_name
       }
    [ FOR { user_name | ROLE role_name | PUBLIC } ]
  |
    FROM DATABASE database_name
]
[ LIMIT row_limit ];
```

## Parameters
<a name="r_SHOW_POLICIES-parameters"></a>

*database\_name*  
The name of the database to show policies from.

*schema\_name*  
The name of the schema of the relation to show attached policies on.

*relation\_name*  
The name of the relation to show attached policies on.

*user\_name*  
The name of the user for whom the policy is attached on the relation.

*role\_name*  
The name of the role for which the policy is attached on the relation.

*row\_limit*  
The maximum number of rows to return. The *row\_limit* can be 0–10,000.

**Note**  
Showing policies from a database other than the connected database is supported with the Amazon Redshift federated permissions catalog. The SHOW POLICIES command supports cross-database queries for all databases in warehouses with Amazon Redshift federated permissions.

## Examples
<a name="r_SHOW_POLICIES-examples"></a>

The following command shows RLS policies from the connected database.

```
SHOW RLS POLICIES;

  policy_name   | policy_alias |                           policy_atts                            |                                                                  policy_qual                                                                         | policy_enabled | policy_modified_by |    policy_modified_time    
----------------+--------------+------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------------------+----------------------------
 policy_america | rls_table    | [{"colname":"region","type":"character varying(10)"}]            | (("rls_table"."region" = CAST('USA' AS TEXT)) OR ("rls_table"."region" = CAST('CANADA' AS TEXT)) OR ("rls_table"."region" = CAST('Mexico' AS TEXT))) | t              | admin              | 2025-11-07 14:57:27
```

The following command shows masking policies from the database "sales\_db@finance-catalog".

```
SHOW MASKING POLICIES FROM DATABASE "sales_db@finance-catalog";

  policy_name  |                          input_columns                           |                                                  policy_expression                                                  | policy_modified_by |    policy_modified_time    
---------------+------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+--------------------+----------------------------
 hash_credit   | [{"colname":"credit_card","type":"character varying(256)"}]      | [{"expr":"SHA2((\"masked_table\".\"credit_card\" + CAST('testSalt' AS TEXT)), CAST(256 AS INT4))","type":"text"}]   | admin              | 2025-11-07 16:05:54
 hash_username | [{"colname":"username","type":"character varying(256)"}]         | [{"expr":"SHA2((\"masked_table\".\"username\" + CAST('otherTestSalt' AS TEXT)), CAST(256 AS INT4))","type":"text"}] | admin              | 2025-11-07 16:07:08
(2 rows)
```

The following command shows RLS policies attached on the relation sales\_table.

```
SHOW RLS POLICIES ON sales_schema.sales_table;

  policy_name   | schema_name  | relation_name | relation_kind | grantor  |          grantee          | grantee_kind | is_policy_on | is_rls_on | rls_conjunction_type 
----------------+--------------+---------------+---------------+----------+---------------------------+--------------+--------------+-----------+----------------------
 policy_global  | sales_schema | sales_table   | table         | admin    | sales_analyst_role_global | role         | t            | t         | and
 policy_america | sales_schema | sales_table   | table         | admin    | sales_analyst_usa         | user         | t            | t         | and
```

The following command shows masking policies attached on the relation transaction\_table from the database "sales\_db@finance-catalog".

```
SHOW MASKING POLICIES ON "sales_db@finance-catalog".sales_schema.transaction_table LIMIT 1;

  policy_name  | schema_name  |   relation_name   | relation_type | grantor  |         grantee          | grantee_type | priority |   input_columns   |   output_columns   
---------------+--------------+-------------------+---------------+----------+--------------------------+--------------+----------+-------------------+-------------------
 hash_username | sales_schema | transaction_table | table         | admin    | transaction_analyst_role | role         |      100 | ["user_name"]     | ["user_name"]
```

The following command shows RLS policies attached on the relation sales\_table from the database "sales\_db@finance-catalog" for the user "IAMR:sales\_analyst\_usa".

```
SHOW RLS POLICIES ON "sales_db@finance-catalog".sales_schema.sales_table FOR "IAMR:sales_analyst_usa";

  policy_name   | schema_name  | relation_name | relation_kind | grantor  |      grantee           | grantee_kind | is_policy_on | is_rls_on | rls_conjunction_type 
----------------+--------------+---------------+---------------+----------+------------------------+--------------+--------------+-----------+----------------------
 policy_america | sales_schema | sales_table   | table         | admin    | IAMR:sales_analyst_usa | user         | t            | t         | and
```

The following command shows masking policies attached on the relation transaction\_table for the role transaction\_analyst\_role.

```
SHOW MASKING POLICIES ON sales_schema.transaction_table FOR ROLE transaction_analyst_role;

  policy_name  | schema_name  |   relation_name   | relation_type | grantor  |         grantee          | grantee_type | priority | input_columns | output_columns 
---------------+--------------+-------------------+---------------+----------+--------------------------+--------------+----------+---------------+----------------
 hash_username | sales_schema | transaction_table | table         | admin    | transaction_analyst_role | role         |      100 | ["user_name"] | ["user_name"]
```

# SHOW PROCEDURE
<a name="r_SHOW_PROCEDURE"></a>

Shows the definition of a given stored procedure, including its signature. You can use the output of SHOW PROCEDURE to recreate the stored procedure. 

## Syntax
<a name="r_SHOW_PROCEDURE-synopsis"></a>

```
SHOW PROCEDURE sp_name [( [ [ argname ] [ argmode ] argtype [, ...] ] )]
```

## Parameters
<a name="r_SHOW_PROCEDURE-parameters"></a>

 *sp\_name*   
The name of the procedure to show. 

*[argname] [argmode] argtype*   
Input argument types to identify the stored procedure. Optionally, you can include the full argument data types, including OUT arguments. This part is optional if the name of the stored procedure is unique (that is, not overloaded).

## Examples
<a name="r_SHOW_PROCEDURE-examples"></a>

The following example shows the definition of the procedure `test_sp2`.

```
show procedure test_sp2(int, varchar);
                                        Stored Procedure Definition
------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE PROCEDURE public.test_sp2(f1 integer, INOUT f2 character varying, OUT character varying)
LANGUAGE plpgsql
AS $_$
DECLARE
out_var alias for $3;
loop_var int;
BEGIN
IF f1 is null OR f2 is null THEN
RAISE EXCEPTION 'input cannot be null';
END IF;
CREATE TEMP TABLE etl(a int, b varchar);
FOR loop_var IN 1..f1 LOOP
insert into etl values (loop_var, f2);
f2 := f2 || '+' || f2;
END LOOP;
SELECT INTO out_var count(*) from etl;
END;
$_$

(1 row)
```
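If the stored procedure name is not overloaded, you can omit the argument list entirely. The following sketch assumes `test_sp2` has only one definition in the database.

```
show procedure test_sp2;
```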

# SHOW PROCEDURES
<a name="r_SHOW_PROCEDURES"></a>

Shows a list of procedures in a schema, along with information about the listed objects.

Each output row has the columns `database_name`, `schema_name`, `procedure_name`, `number_of_arguments`, `argument_list`, `return_type`, and `remarks`.

If more than 10,000 rows would result from SHOW PROCEDURES, the command raises an error.

## Required permissions
<a name="r_SHOW_PROCEDURES-required-permissions"></a>

To view a procedure in a Redshift schema, the current user must satisfy one of the following criteria:
+ Be a superuser
+ Be the owner of the procedure
+ Be granted the USAGE privilege on the parent schema and the EXECUTE privilege on the procedure
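For a non-owner, the schema and procedure grants above can be sketched as follows; `demo_schema`, `sp_process_data`, and `analyst_user` are hypothetical names.

```
GRANT USAGE ON SCHEMA demo_schema TO analyst_user;
GRANT EXECUTE ON PROCEDURE demo_schema.sp_process_data(numeric, numeric) TO analyst_user;
```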

## Syntax
<a name="r_SHOW_PROCEDURES-synopsis"></a>

```
SHOW PROCEDURES FROM SCHEMA
[database_name.]schema_name
[LIKE 'filter_pattern'] [LIMIT row_limit]
```

## Parameters
<a name="r_SHOW_PROCEDURES-parameters"></a>

*database\_name*  
The name of the database that contains the procedures to list.

*schema\_name*  
The name of the schema that contains the procedures to list.

*filter\_pattern*  
A valid UTF-8 character expression with a pattern to match procedure names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_PROCEDURES.html)
Note that the *filter\_pattern* only matches the procedure name.

*row\_limit*  
The maximum number of rows to return. The *row\_limit* can be 0–10,000.

## Examples
<a name="r_SHOW_PROCEDURES-examples"></a>

The following example shows procedures from the schema demo\_db.demo\_schema:

```
SHOW PROCEDURES FROM SCHEMA demo_db.demo_schema;
 database_name | schema_name |  procedure_name   | number_of_arguments |                argument_list                 |                           return_type                            | remarks 
---------------+-------------+-------------------+---------------------+----------------------------------------------+------------------------------------------------------------------+---------
 demo_db       | demo_schema | f1                |                   4 | character varying, numeric, numeric, numeric | numeric, character varying, timestamp without time zone, boolean | 
 demo_db       | demo_schema | sp_get_result_set |                   2 | integer, refcursor                           | refcursor                                                        | 
 demo_db       | demo_schema | sp_process_data   |                   2 | numeric, numeric                             | numeric, character varying                                       |
```

The following example shows procedures from the schema demo\_schema with names ending in 'data':

```
SHOW PROCEDURES FROM SCHEMA demo_schema like '%data';
 database_name | schema_name | procedure_name  | number_of_arguments |  argument_list   |        return_type         | remarks 
---------------+-------------+-----------------+---------------------+------------------+----------------------------+---------
 demo_db       | demo_schema | sp_process_data |                   2 | numeric, numeric | numeric, character varying |
```

# SHOW SCHEMAS
<a name="r_SHOW_SCHEMAS"></a>

Shows a list of schemas in a database, along with some schema attributes.

Each output row consists of database name, schema name, schema owner, schema type, schema ACL, source database, and schema option. For more information about these attributes, see [SVV\_ALL\_SCHEMAS](r_SVV_ALL_SCHEMAS.md).

If more than 10,000 schemas would result from the SHOW SCHEMAS command, then an error is returned.

## Required permissions
<a name="r_SHOW_SCHEMAS-privileges"></a>

To view a schema in an Amazon Redshift database, the current user must satisfy one of the following criteria:
+ Be a superuser.
+ Be the owner of the schema.
+ Be granted the USAGE privilege on the schema.

## Syntax
<a name="r_SHOW_SCHEMAS-synopsis"></a>

```
SHOW SCHEMAS FROM DATABASE database_name [LIKE 'filter_pattern'] [LIMIT row_limit ]
```

## Parameters
<a name="r_SHOW_SCHEMAS-parameters"></a>

 *database\_name*   
The name of the database that contains the schemas to list.   
To show schemas in an AWS Glue Data Catalog, specify `awsdatacatalog` as the database name, and ensure the system configuration `data_catalog_auto_mount` is set to `true`. For more information, see [ALTER SYSTEM](r_ALTER_SYSTEM.md).

 *filter\_pattern*   
A valid UTF-8 character expression with a pattern to match schema names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_SCHEMAS.html)
If *filter\_pattern* does not contain metacharacters, then the pattern only represents the string itself; in that case LIKE acts the same as the equals operator. 

 *row\_limit*   
The maximum number of rows to return. The *row\_limit* can be 0–10,000. 

## Examples
<a name="r_SHOW_SCHEMAS-examples"></a>

The following example shows the schemas from the Amazon Redshift database named `dev`.

```
SHOW SCHEMAS FROM DATABASE dev;

 database_name |     schema_name      | schema_owner | schema_type |         schema_acl          | source_database | schema_option 
---------------+----------------------+--------------+-------------+-----------------------------+-----------------+---------------
 dev           | pg_automv            |            1 | local       |                             |                 | 
 dev           | pg_catalog           |            1 | local       | jpuser=UC/jpuser~=U/jpuser  |                 | 
 dev           | public               |            1 | local       | jpuser=UC/jpuser~=UC/jpuser |                 | 
 dev           | information_schema   |            1 | local       | jpuser=UC/jpuser~=U/jpuser  |                 | 
 dev           | schemad79cd6d93bf043 |            1 | local       |                             |                 |
```

The following example shows the schemas in the AWS Glue Data Catalog database named `awsdatacatalog`. The maximum number of output rows is `5`.

```
SHOW SCHEMAS FROM DATABASE awsdatacatalog LIMIT 5;

 database_name  |     schema_name      | schema_owner | schema_type | schema_acl | source_database | schema_option 
----------------+----------------------+--------------+-------------+------------+-----------------+---------------
 awsdatacatalog | 000_too_many_glue_db |              | EXTERNAL    |            |                 | 
 awsdatacatalog | 123_default          |              | EXTERNAL    |            |                 | 
 awsdatacatalog | adhoc                |              | EXTERNAL    |            |                 | 
 awsdatacatalog | all_shapes_10mb      |              | EXTERNAL    |            |                 | 
 awsdatacatalog | all_shapes_1g        |              | EXTERNAL    |            |                 |
```

# SHOW TABLE
<a name="r_SHOW_TABLE"></a>

Shows the definition of a table, including table attributes, table constraints, column attributes, column collation and column constraints. You can use the output of the SHOW TABLE statement to recreate the table. 

For more information on table creation, see [CREATE TABLE](r_CREATE_TABLE_NEW.md). 

## Syntax
<a name="r_SHOW_TABLE-synopsis"></a>

```
SHOW TABLE [schema_name.]table_name 
```

## Parameters
<a name="r_SHOW_TABLE-parameters"></a>

 *schema\_name*   
(Optional) The name of the related schema. 

 *table\_name*   
The name of the table to show. 

## Examples
<a name="r_SHOW_TABLE-examples"></a>

Following is an example of the SHOW TABLE output for the table `sales`.

```
show table sales;
```

```
CREATE TABLE public.sales (
salesid integer NOT NULL ENCODE az64,
listid integer NOT NULL ENCODE az64 distkey,
sellerid integer NOT NULL ENCODE az64,
buyerid integer NOT NULL ENCODE az64,
eventid integer NOT NULL ENCODE az64,
dateid smallint NOT NULL,
qtysold smallint NOT NULL ENCODE az64,
pricepaid numeric(8,2) ENCODE az64,
commission numeric(8,2) ENCODE az64,
saletime timestamp without time zone ENCODE az64
)
DISTSTYLE KEY SORTKEY ( dateid );
```

Following is an example of the SHOW TABLE output for the table `category` in the schema `public`. The collation of the database is CASE\_SENSITIVE.

```
show table public.category;
```

```
CREATE TABLE public.category (
catid smallint NOT NULL distkey,
catgroup character varying(10) ENCODE lzo COLLATE case_sensitive,
catname character varying(10) ENCODE lzo COLLATE case_sensitive,
catdesc character varying(50) ENCODE lzo COLLATE case_sensitive
) 
DISTSTYLE KEY SORTKEY ( catid );
```

The following example creates table `foo` with a primary key.

```
create table foo(a int PRIMARY KEY, b int);
```

The SHOW TABLE results display the create statement with all properties of the `foo` table.

```
show table foo;
```

```
CREATE TABLE public.foo ( 
a integer NOT NULL ENCODE az64, 
b integer ENCODE az64, PRIMARY KEY (a) 
) 
DISTSTYLE AUTO;
```

In this example, we create a table where column `a` inherits the database's default CASE\_SENSITIVE collation, while `b` and `c` are explicitly set to CASE\_INSENSITIVE collation.

```
CREATE TABLE public.foo (
a CHAR, 
b VARCHAR(10) COLLATE CASE_INSENSITIVE, 
c SUPER COLLATE CASE_INSENSITIVE
);
```

The SHOW TABLE results display the create statement with all properties of the `foo` table.

```
show table public.foo;
```

```
CREATE TABLE public.foo (
a character(1) ENCODE lzo COLLATE case_sensitive,
b character varying(10) ENCODE lzo COLLATE case_insensitive,
c super COLLATE case_insensitive
)
DISTSTYLE AUTO;
```

# SHOW TABLES
<a name="r_SHOW_TABLES"></a>

Shows a list of tables in a schema, along with some table attributes.

Each output row consists of database name, schema name, table name, table type, table ACL, remarks, table owner, last altered time, last modified time, dist\_style, and table sub-type. For more information about these attributes, see [SVV\_ALL\_TABLES](r_SVV_ALL_TABLES.md).

The modification and alteration timestamps can lag behind the table updates by approximately 20 minutes.

If more than 10,000 tables would result from the SHOW TABLES command, then an error is returned.

## Required permissions
<a name="r_SHOW_TABLES-privileges"></a>

To view a table in an Amazon Redshift schema, the current user must satisfy one of the following criteria:
+ Be a superuser.
+ Be the owner of the table.
+ Be granted the USAGE privilege on the parent schema, and either the SELECT privilege on the table or the SELECT privilege on any column in the table.
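As a sketch, either of the following SELECT grants, combined with USAGE on the parent schema, lets a non-owner see the table in the output; `s1`, `test_table`, `test_col`, and `analyst_user` are hypothetical names.

```
GRANT USAGE ON SCHEMA s1 TO analyst_user;
GRANT SELECT ON TABLE s1.test_table TO analyst_user;
-- or, alternatively, a column-level grant on any single column:
GRANT SELECT (test_col) ON TABLE s1.test_table TO analyst_user;
```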

## Syntax
<a name="r_SHOW_TABLES-synopsis"></a>

```
SHOW TABLES FROM SCHEMA database_name.schema_name [LIKE 'filter_pattern'] [LIMIT row_limit ]
```

## Parameters
<a name="r_SHOW_TABLES-parameters"></a>

 *database\_name*   
The name of the database that contains the tables to list.   
To show tables in an AWS Glue Data Catalog, specify `awsdatacatalog` as the database name, and ensure the system configuration `data_catalog_auto_mount` is set to `true`. For more information, see [ALTER SYSTEM](r_ALTER_SYSTEM.md).

 *schema\_name*   
The name of the schema that contains the tables to list.   
To show AWS Glue Data Catalog tables, provide the AWS Glue database name as the schema name.

 *filter\_pattern*   
A valid UTF-8 character expression with a pattern to match table names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_TABLES.html)
If *filter\_pattern* does not contain metacharacters, then the pattern only represents the string itself; in that case LIKE acts the same as the equals operator. 

 *row\_limit*   
The maximum number of rows to return. The *row\_limit* can be 0–10,000. 

## Examples
<a name="r_SHOW_TABLES-examples"></a>

```
SHOW TABLES FROM SCHEMA s1;

 database_name | schema_name |    table_name     | table_type |              table_acl              | remarks | owner |     last_altered_time      |     last_modified_time     | dist_style |   table_subtype   
---------------+-------------+-------------------+------------+-------------------------------------+---------+-------+----------------------------+----------------------------+------------+-------------------
 dev           | s1          | late_binding_view | VIEW       | alice=arwdRxtDPA/alice~bob=d/alice  |         | alice |                            |                            |            | LATE BINDING VIEW
 dev           | s1          | manual_mv         | VIEW       | alice=arwdRxtDPA/alice~bob=P/alice  |         | alice |                            |                            |            | MATERIALIZED VIEW
 dev           | s1          | regular_view      | VIEW       | alice=arwdRxtDPA/alice~bob=r/alice  |         | alice |                            |                            |            | REGULAR VIEW
 dev           | s1          | test_table        | TABLE      | alice=arwdRxtDPA/alice~bob=rw/alice |         | alice | 2025-11-18 15:52:00.010452 | 2025-11-18 15:44:34.856073 | AUTO (ALL) | REGULAR TABLE
```

```
SHOW TABLES FROM SCHEMA dev.s1 LIKE '%view' LIMIT 1;

 database_name | schema_name |    table_name     | table_type |              table_acl               | remarks | owner | last_altered_time | last_modified_time | dist_style |   table_subtype   
---------------+-------------+-------------------+------------+--------------------------------------+---------+-------+-------------------+--------------------+------------+-------------------
 dev           | s1          | late_binding_view | VIEW       | {alice=arwdRxtDPA/alice,bob=d/alice} |         | alice |                   |                    |            | LATE BINDING VIEW
```

# SHOW TEMPLATE
<a name="r_SHOW_TEMPLATE"></a>

Displays the complete definition of a template, including the fully qualified name (database, schema, and template name) and all parameters. The output is a valid CREATE TEMPLATE statement that you can use to recreate the template or create a similar template with modifications. 

For more information on template creation, see [CREATE TEMPLATE](r_CREATE_TEMPLATE.md). 

## Required permissions
<a name="r_SHOW_TEMPLATE-privileges"></a>

To view a template definition, you must have one of the following:
+ Superuser privileges
+ USAGE privilege on the template and USAGE privilege on the schema containing the template

## Syntax
<a name="r_SHOW_TEMPLATE-synopsis"></a>

```
SHOW TEMPLATE [database_name.][schema_name.]template_name;
```

## Parameters
<a name="r_SHOW_TEMPLATE-parameters"></a>

 *database\_name*   
(Optional) The name of the database in which the template is created. If not specified, the current database is used. 

 *schema\_name*   
(Optional) The name of the schema in which the template is created. If not specified, the template is searched for in the current search path. 

 *template\_name*   
The name of the template. 

## Examples
<a name="r_SHOW_TEMPLATE-examples"></a>

The following is an example of the SHOW TEMPLATE output for the template `test_template`:

```
CREATE TEMPLATE test_template FOR COPY AS NOLOAD DELIMITER ',' ENCODING UTF16 ENCRYPTED;
```

```
SHOW TEMPLATE test_template;

CREATE OR REPLACE TEMPLATE dev.public.test_template FOR COPY AS ENCRYPTED NOLOAD ENCODING UTF16 DELIMITER ',';
```

The following example creates template `demo_template` in schema `demo_schema`.

```
CREATE OR REPLACE TEMPLATE demo_schema.demo_template FOR COPY AS
ACCEPTANYDATE ACCEPTINVCHARS DATEFORMAT 'DD-MM-YYYY' EXPLICIT_IDS ROUNDEC
TIMEFORMAT  AS 'DD.MM.YYYY HH:MI:SS' TRUNCATECOLUMNS NULL  AS 'null_string';
```

```
SHOW TEMPLATE demo_schema.demo_template;

CREATE OR REPLACE TEMPLATE dev.demo_schema.demo_template FOR COPY AS TRUNCATECOLUMNS NULL 'null_string' EXPLICIT_IDS TIMEFORMAT 'DD.MM.YYYY HH:MI:SS' ACCEPTANYDATE ROUNDEC ACCEPTINVCHARS DATEFORMAT 'DD-MM-YYYY';
```

# SHOW TEMPLATES
<a name="r_SHOW_TEMPLATES"></a>

Shows a list of templates in a schema, along with their attributes.

Each output row consists of template name, template id, template type, template owner, database name, schema name, create time, last modified time, and last modified by. 

For complete template details, including template parameters, see [SYS\_REDSHIFT\_TEMPLATE](SYS_REDSHIFT_TEMPLATE.md).

## Required permissions
<a name="r_SHOW_TEMPLATES-privileges"></a>

To view templates in an Amazon Redshift schema, you must have one of the following:
+ Superuser privileges
+ USAGE privilege on the schema containing the templates

## Syntax
<a name="r_SHOW_TEMPLATES-synopsis"></a>

```
SHOW TEMPLATES FROM SCHEMA [database_name.]schema_name [LIKE 'filter_pattern'] [LIMIT row_limit ];
```

## Parameters
<a name="r_SHOW_TEMPLATES-parameters"></a>

 *database\_name*   
(Optional) The name of the database containing the templates to list. If not provided, the current database is used.

 *schema\_name*   
The name of the schema that contains the templates to list. 

 *filter\_pattern*   
(Optional) A valid UTF-8 character expression with a pattern to match template names. The LIKE option performs a case-sensitive match that supports the following pattern-matching metacharacters:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/redshift/latest/dg/r_SHOW_TEMPLATES.html)
If *filter\_pattern* does not contain metacharacters, then the pattern only represents the string itself; in that case LIKE acts the same as the equals operator. 

 *row\_limit*   
The maximum number of rows to return. Valid range is 0 to the template limit on the cluster (default is 1000).

## Examples
<a name="r_SHOW_TEMPLATES-examples"></a>

```
SHOW TEMPLATES FROM SCHEMA s1;

 template_name          | template_id | template_type | template_owner | database_name | schema_name |        create_time         |     last_modified_time     | last_modified_by
------------------------+-------------+---------------+----------------+---------------+-------------+----------------------------+----------------------------+------------------
 template_maxerror      |      107685 | COPY          | alice          | dev           | s1          | 2025-12-16 19:31:10.514076 | 2025-12-16 19:31:10.514076 |              100
 json_template          |      107687 | COPY          | alice          | dev           | s1          | 2025-12-16 19:31:33.229566 | 2025-12-16 19:31:33.229567 |              100
 noload_template        |      107686 | COPY          | alice          | dev           | s1          | 2025-12-16 19:31:17.370547 | 2025-12-16 19:31:17.370547 |              100
 csv_delimiter_template |      107688 | COPY          | alice          | dev           | s1          | 2025-12-16 19:31:42.354044 | 2025-12-16 19:31:42.354045 |              100
```

```
SHOW TEMPLATES FROM SCHEMA dev.s1 LIKE '%template' LIMIT 1;

 template_name  | template_id | template_type | template_owner | database_name | schema_name |        create_time         |     last_modified_time     | last_modified_by 
-----------------+-------------+---------------+----------------+---------------+-------------+----------------------------+----------------------------+------------------
 noload_template |      107686 | COPY          | alice          | dev           | s1          | 2025-12-16 19:31:17.370547 | 2025-12-16 19:31:17.370547 |              100
```

# SHOW VIEW
<a name="r_SHOW_VIEW"></a>

Shows the definition of a view, including for materialized views and late-binding views. You can use the output of the SHOW VIEW statement to recreate the view. 

## Syntax
<a name="r_SHOW_VIEW-synopsis"></a>

```
SHOW VIEW [schema_name.]view_name 
```

## Parameters
<a name="r_SHOW_VIEW-parameters"></a>

 *schema\_name*   
(Optional) The name of the related schema. 

 *view\_name*   
The name of the view to show. 

## Examples
<a name="r_SHOW_VIEW-examples"></a>

 Following is the view definition for the view `LA_Venues_v`.

```
create view LA_Venues_v as select * from venue where venuecity='Los Angeles';
```

Following is an example of the SHOW VIEW command and output for the view defined preceding.

```
show view LA_Venues_v;
```

```
SELECT venue.venueid,
venue.venuename,
venue.venuecity,
venue.venuestate,
venue.venueseats
FROM venue WHERE ((venue.venuecity)::text = 'Los Angeles'::text);
```

Following is the view definition for the view `public.Sports_v` in the schema `public`.

```
create view public.Sports_v as select * from category where catgroup='Sports';
```

Following is an example of the SHOW VIEW command and output for the view defined preceding.

```
show view public.Sports_v;
```

```
SELECT category.catid,
category.catgroup,
category.catname,
category.catdesc
FROM category WHERE ((category.catgroup)::text = 'Sports'::text);
```
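SHOW VIEW works the same way for materialized views and late-binding views. The following sketch assumes a materialized view named `my_mv` (a hypothetical name); the output is a view definition you can use to recreate it.

```
show view my_mv;
```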

# START TRANSACTION
<a name="r_START_TRANSACTION"></a>

Synonym of the BEGIN function. 

See [BEGIN](r_BEGIN.md). 

# TRUNCATE
<a name="r_TRUNCATE"></a>

Deletes all of the rows from a table without doing a table scan: this operation is a faster alternative to an unqualified DELETE operation. To run a TRUNCATE command, you must have the TRUNCATE permission for the table, be the owner of the table, or be a superuser. To grant permissions to truncate a table, use the [GRANT](r_GRANT.md) command.

TRUNCATE is much more efficient than DELETE and doesn't require a VACUUM and ANALYZE. However, be aware that TRUNCATE commits the transaction in which it is run.
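For comparison, the unqualified DELETE below removes the same rows as a TRUNCATE, but it can be rolled back and leaves the table in need of a subsequent VACUUM:

```
delete from category;
```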

## Syntax
<a name="r_TRUNCATE-synopsis"></a>

```
TRUNCATE [ TABLE ] table_name
```

The command also works on a materialized view.

```
TRUNCATE materialized_view_name
```

## Parameters
<a name="r_TRUNCATE-parameters"></a>

TABLE   
Optional keyword. 

 *table\_name*   
A temporary or persistent table. To truncate it, you must be the owner of the table, be a superuser, or have the TRUNCATE permission on the table.   
You can truncate any table, including tables that are referenced in foreign-key constraints.   
You don't need to vacuum a table after truncating it. 

 *materialized\_view\_name*   
A materialized view.  
You can truncate a materialized view that is used for [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md). 

## Usage notes
<a name="r_TRUNCATE_usage_notes"></a>
+  The TRUNCATE command commits the transaction in which it is run; therefore, you can't roll back a TRUNCATE operation, and a TRUNCATE command may commit other operations when it commits itself. 
+ TRUNCATE operations hold exclusive locks when run on Amazon Redshift streaming materialized views connected to any of the following:
  +  An Amazon Kinesis data stream 
  +  An Amazon Managed Streaming for Apache Kafka topic 
  +  A supported external stream, such as a Confluent Cloud Kafka topic 

  For more information, see [Streaming ingestion to a materialized view](materialized-view-streaming-ingestion.md).

## Examples
<a name="r_TRUNCATE-examples"></a>

Use the TRUNCATE command to delete all of the rows from the CATEGORY table: 

```
truncate category;
```

Attempt to roll back a TRUNCATE operation: 

```
begin;

truncate date;

rollback;

select count(*) from date;
count
-------
0
(1 row)
```

The DATE table remains empty after the ROLLBACK command because the TRUNCATE command committed automatically. 

The following example uses the TRUNCATE command to delete all of the rows from a materialized view. 

```
truncate my_materialized_view;
```

It deletes all records in the materialized view and leaves the materialized view and its schema intact. The materialized view name in this query is a sample name.

# UNLOAD
<a name="r_UNLOAD"></a>


|  | 
| --- |
|  Client-side encryption for COPY and UNLOAD commands will no longer be open to new customers starting April 30, 2025. If you used client-side encryption with COPY and UNLOAD commands in the 12 months before April 30, 2025, you can continue to use client side encryption with COPY or UNLOAD commands until April 30, 2026. After April 30, 2026, you won't be able to use client-side encryption for COPY and UNLOAD. We recommend that you switch to using server-side encryption for COPY and UNLOAD as soon as possible. If you're already using server-side encryption for COPY and UNLOAD, there's no change and you can continue to use it without altering your queries. For more information on encryption for COPY and UNLOAD, see the ENCRYPTED parameter below.  | 

Unloads the result of a query to one or more text, JSON, or Apache Parquet files on Amazon S3, using Amazon S3 server-side encryption (SSE-S3). You can also specify server-side encryption with an AWS Key Management Service key (SSE-KMS).

By default, the format of the unloaded file is pipe-delimited ( `|` ) text.

You can manage the size of files on Amazon S3, and by extension the number of files, by setting the MAXFILESIZE parameter. Ensure that the S3 IP ranges are added to your allow list. To learn more about the required S3 IP ranges, see [ Network isolation](https://docs.aws.amazon.com//redshift/latest/mgmt/security-network-isolation.html#network-isolation).

You can unload the result of an Amazon Redshift query to your Amazon S3 data lake in Apache Parquet, an efficient open columnar storage format for analytics. Parquet format is up to 2x faster to unload and consumes up to 6x less storage in Amazon S3, compared with text formats. This enables you to save data transformation and enrichment you have done in Amazon Redshift into your Amazon S3 data lake in an open format. You can then analyze your data with Redshift Spectrum and other AWS services such as Amazon Athena, Amazon EMR, and Amazon SageMaker AI. 

For more information and example scenarios about using the UNLOAD command, see [Unloading data in Amazon Redshift](c_unloading_data.md).

## Required privileges and permissions
<a name="r_UNLOAD-permissions"></a>

For the UNLOAD command to succeed, you need at least SELECT privilege on the data in the database, along with permission to write to the Amazon S3 location. For information about permissions to access AWS resources for the UNLOAD command, see [Permissions to access other AWS Resources](copy-usage_notes-access-permissions.md).

To apply least-privilege permissions, grant only the permissions that the user running the command needs:
+ The user must have SELECT privilege on the data. For information about how to limit database privileges, see [GRANT](r_GRANT.md).
+ The user needs permission to assume the IAM role to write to the Amazon S3 bucket in your AWS account. To restrict access for a database user to assume a role, see [Restricting access to IAM roles](https://docs.aws.amazon.com/redshift/latest/mgmt/authorizing-redshift-service-database-users.html) in the *Amazon Redshift Management Guide*.
+ The user needs access to the Amazon S3 bucket. To restrict permission using an Amazon S3 bucket policy, see [Bucket policies for Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-policies.html) in the *Amazon Simple Storage Service User Guide*.

## Syntax
<a name="r_UNLOAD-synopsis"></a>

```
UNLOAD ('select-statement')
TO 's3://object-path/name-prefix'
authorization
[ option, ...] 

where authorization is
IAM_ROLE { default | 'arn:aws:iam::<AWS account-id-1>:role/<role-name>[,arn:aws:iam::<AWS account-id-2>:role/<role-name>][,...]' }
            
where option is
| [ FORMAT [ AS ] ] CSV | PARQUET | JSON
| PARTITION BY ( column_name [, ... ] ) [ INCLUDE ]
| MANIFEST [ VERBOSE ]
| HEADER
| DELIMITER [ AS ] 'delimiter-char'
| FIXEDWIDTH [ AS ] 'fixedwidth-spec'
| ENCRYPTED [ AUTO ]
| BZIP2
| GZIP
| ZSTD
| ADDQUOTES
| NULL [ AS ] 'null-string'
| ESCAPE
| ALLOWOVERWRITE
| CLEANPATH
| PARALLEL [ { ON | TRUE } | { OFF | FALSE } ]
| MAXFILESIZE [AS] max-size [ MB | GB ]
| ROWGROUPSIZE [AS] size [ MB | GB ]
| REGION [AS] 'aws-region'
| EXTENSION 'extension-name'
```

## Parameters
<a name="unload-parameters"></a>

('*select-statement*')   
A SELECT query. The results of the query are unloaded. In most cases, it is worthwhile to unload data in sorted order by specifying an ORDER BY clause in the query. This approach saves the time required to sort the data when it is reloaded.   
The query must be enclosed in single quotation marks as shown following:   

```
('select * from venue order by venueid')
```
If your query contains quotation marks (for example to enclose literal values), put the literal between two sets of single quotation marks—you must also enclose the query between single quotation marks:   

```
('select * from venue where venuestate=''NV''')
```

TO 's3://*object-path/name-prefix*'   
The full path, including bucket name, to the location on Amazon S3 where Amazon Redshift writes the output file objects, including the manifest file if MANIFEST is specified. The object names are prefixed with *name-prefix*. If you use `PARTITION BY`, a forward slash (/) is automatically added to the end of the *name-prefix* value if needed. For added security, UNLOAD connects to Amazon S3 using an HTTPS connection. By default, UNLOAD writes one or more files per slice. UNLOAD appends a slice number and part number to the specified name prefix as follows:  
`<object-path>/<name-prefix><slice-number>_part_<part-number>`.   
If MANIFEST is specified, the manifest file is written as follows:  
`<object_path>/<name_prefix>manifest`.   
If PARALLEL is specified OFF, the data files are written as follows:  
`<object_path>/<name_prefix><part-number>`.   
UNLOAD automatically creates encrypted files using Amazon S3 server-side encryption (SSE), including the manifest file if MANIFEST is used. The COPY command automatically reads server-side encrypted files during the load operation. You can transparently download server-side encrypted files from your bucket using either the Amazon S3 console or API. For more information, see [Protecting Data Using Server-Side Encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html).   
REGION is required when the Amazon S3 bucket isn't in the same AWS Region as the Amazon Redshift database. 

*authorization*  
The UNLOAD command needs authorization to write data to Amazon S3. The UNLOAD command uses the same parameters the COPY command uses for authorization. For more information, see [Authorization parameters](copy-parameters-authorization.md) in the COPY command syntax reference.

IAM_ROLE { default | 'arn:aws:iam::*<AWS account-id-1>*:role/*<role-name>*' }   <a name="unload-iam"></a>
Use the default keyword to have Amazon Redshift use the IAM role that is set as default and associated with the cluster when the UNLOAD command runs.  
Use the Amazon Resource Name (ARN) for an IAM role that your cluster uses for authentication and authorization. If you specify IAM_ROLE, you can't use ACCESS_KEY_ID and SECRET_ACCESS_KEY, SESSION_TOKEN, or CREDENTIALS. The IAM_ROLE can be chained. For more information, see [Chaining IAM roles](https://docs.aws.amazon.com/redshift/latest/mgmt/authorizing-redshift-service.html#authorizing-redshift-service-chaining-roles) in the *Amazon Redshift Management Guide*.

[ FORMAT [ AS ] ] CSV | PARQUET | JSON  <a name="unload-csv"></a>
Keywords to specify the unload format to override the default format.   
When CSV, unloads to a text file in CSV format using a comma ( , ) character as the default delimiter. If a field contains delimiters, double quotation marks, newline characters, or carriage returns, then the field in the unloaded file is enclosed in double quotation marks. A double quotation mark within a data field is escaped by an additional double quotation mark. When zero rows are unloaded, Amazon Redshift might write empty Amazon S3 objects.  
When PARQUET, unloads to a file in Apache Parquet version 1.0 format. By default, each row group is compressed using SNAPPY compression. For more information about Apache Parquet format, see [Parquet](https://parquet.apache.org/).   
When JSON, unloads to a JSON file with each line containing a JSON object, representing a full record in the query result. Amazon Redshift supports writing nested JSON when the query result contains SUPER columns. To create a valid JSON object, the name of each column in the query must be unique. In the JSON file, boolean values are unloaded as `t` or `f`, and NULL values are unloaded as `null`. When zero rows are unloaded, Amazon Redshift does not write Amazon S3 objects.  
The FORMAT and AS keywords are optional. You can't use CSV with ESCAPE, FIXEDWIDTH, or ADDQUOTES. You can't use PARQUET with DELIMITER, FIXEDWIDTH, ADDQUOTES, ESCAPE, NULL AS, HEADER, GZIP, BZIP2, or ZSTD. PARQUET with ENCRYPTED is only supported with server-side encryption with an AWS Key Management Service key (SSE-KMS). You can't use JSON with DELIMITER, HEADER, FIXEDWIDTH, ADDQUOTES, ESCAPE, or NULL AS.

PARTITION BY ( *column_name* [, ... ] ) [INCLUDE]  <a name="unload-partitionby"></a>
Specifies the partition keys for the unload operation. UNLOAD automatically partitions output files into partition folders based on the partition key values, following the Apache Hive convention. For example, a Parquet file that belongs to the partition year 2019 and the month September has the following prefix: `s3://amzn-s3-demo-bucket/my_prefix/year=2019/month=September/000.parquet`.   
The value for *column_name* must be a column in the query results being unloaded.   
If you specify PARTITION BY with the INCLUDE option, partition columns aren't removed from the unloaded files.   
Amazon Redshift doesn't support string literals in PARTITION BY clauses.

MANIFEST [ VERBOSE ]  
Creates a manifest file that explicitly lists details for the data files that are created by the UNLOAD process. The manifest is a text file in JSON format that lists the URL of each file that was written to Amazon S3.   
If MANIFEST is specified with the VERBOSE option, the manifest includes the following details:   
+ The column names and data types, and for CHAR, VARCHAR, or NUMERIC data types, dimensions for each column. For CHAR and VARCHAR data types, the dimension is the length. For a DECIMAL or NUMERIC data type, the dimensions are precision and scale. 
+ The row count unloaded to each file. If the HEADER option is specified, the row count includes the header line. 
+ The total file size of all files unloaded and the total row count unloaded to all files. If the HEADER option is specified, the row count includes the header lines. 
+ The author. Author is always "Amazon Redshift".
You can specify VERBOSE only following MANIFEST.   
The manifest file is written to the same Amazon S3 path prefix as the unload files in the format `<object_path_prefix>manifest`. For example, if UNLOAD specifies the Amazon S3 path prefix '`s3://amzn-s3-demo-bucket/venue_`', the manifest file location is '`s3://amzn-s3-demo-bucket/venue_manifest`'.
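As an illustrative sketch, the following statement unloads the VENUE table with a verbose manifest. The bucket and IAM role ARN are the sample values used elsewhere in this section.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_pipe_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest verbose;
```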

HEADER  
Adds a header line containing column names at the top of each output file. Text transformation options, such as CSV, DELIMITER, ADDQUOTES, and ESCAPE, also apply to the header line. You can't use HEADER with FIXEDWIDTH.
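For illustration, the following sketch unloads the VENUE table in CSV format with a header line. The bucket and role ARN are the sample values used elsewhere in this section.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_csv_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv header;
```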

DELIMITER AS '*delimiter_character*'   
Specifies a single ASCII character that is used to separate fields in the output file, such as a pipe character ( | ), a comma ( , ), or a tab ( \t ). The default delimiter for text files is a pipe character. The default delimiter for CSV files is a comma character. The AS keyword is optional. You can't use DELIMITER with FIXEDWIDTH. If the data contains the delimiter character, you need to specify the ESCAPE option to escape the delimiter, or use ADDQUOTES to enclose the data in double quotation marks. Alternatively, specify a delimiter that isn't contained in the data.

FIXEDWIDTH '*fixedwidth_spec*'   
Unloads the data to a file where each column width is a fixed length, rather than separated by a delimiter. The *fixedwidth_spec* is a string that specifies the number of columns and the width of the columns. The AS keyword is optional. Because FIXEDWIDTH doesn't truncate data, the specification for each column in the UNLOAD statement needs to be at least as long as the length of the longest entry for that column. The format for *fixedwidth_spec* is shown below:   

```
'colID1:colWidth1,colID2:colWidth2, ...'
```
You can't use FIXEDWIDTH with DELIMITER or HEADER.
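As a sketch, the following statement unloads five VENUE columns as fixed-width fields. The column labels and widths here are illustrative; in practice, each width must be at least as long as the longest value in that column.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_fw_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth 'venueid:6,venuename:100,venuecity:30,venuestate:2,venueseats:10';
```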

ENCRYPTED [AUTO]  <a name="unload-parameters-encrypted"></a>
Specifies that the output files on Amazon S3 are encrypted using Amazon S3 server-side encryption. If MANIFEST is specified, the manifest file is also encrypted. For more information, see [Unloading encrypted data files](t_unloading_encrypted_files.md). If you don't specify the ENCRYPTED parameter, UNLOAD automatically creates encrypted files using Amazon S3 server-side encryption with AWS-managed encryption keys (SSE-S3).   
For ENCRYPTED, you might want to unload to Amazon S3 using server-side encryption with an AWS KMS key (SSE-KMS). If so, use the [KMS_KEY_ID](#unload-parameters-kms-key-id) parameter to provide the key ID. You can't use the [CREDENTIALS](copy-parameters-authorization.md#copy-credentials) parameter with the KMS_KEY_ID parameter. If you run an UNLOAD command for data using KMS_KEY_ID, you can then do a COPY operation for the same data without specifying a key.   
If ENCRYPTED AUTO is used, the UNLOAD command fetches the default AWS KMS encryption key on the target Amazon S3 bucket property and encrypts the files written to Amazon S3 with the AWS KMS key. If the bucket doesn't have the default AWS KMS encryption key, UNLOAD automatically creates encrypted files using Amazon Redshift server-side encryption with AWS-managed encryption keys (SSE-S3). You can't use this option with KMS_KEY_ID, MASTER_SYMMETRIC_KEY, or CREDENTIALS that contains master_symmetric_key. 

KMS_KEY_ID '*key-id*'  <a name="unload-parameters-kms-key-id"></a>
Specifies the key ID for an AWS Key Management Service (AWS KMS) key to be used to encrypt data files on Amazon S3. For more information, see [What is AWS Key Management Service?](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html) If you specify KMS_KEY_ID, you must also specify the [ENCRYPTED](#unload-parameters-encrypted) parameter. If you specify KMS_KEY_ID, you can't authenticate using the CREDENTIALS parameter. Instead, use either the [IAM_ROLE](copy-parameters-authorization.md#copy-iam-role) or the [ACCESS_KEY_ID and SECRET_ACCESS_KEY](copy-parameters-authorization.md#copy-access-key-id) parameters. 
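The following sketch unloads with SSE-KMS encryption. The key ID shown is a placeholder; substitute the ID of a KMS key in your account.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_enc_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
kms_key_id '1234abcd-12ab-34cd-56ef-1234567890ab'
encrypted;
```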

BZIP2   
Unloads data to one or more bzip2-compressed files per slice. Each resulting file is appended with a `.bz2` extension. 

GZIP   
Unloads data to one or more gzip-compressed files per slice. Each resulting file is appended with a `.gz` extension. 

ZSTD   
Unloads data to one or more Zstandard-compressed files per slice. Each resulting file is appended with a `.zst` extension. 

ADDQUOTES   
Places quotation marks around each unloaded data field, so that Amazon Redshift can unload data values that contain the delimiter itself. For example, if the delimiter is a comma, you could unload and reload the following data successfully:   

```
 "1","Hello, World" 
```
Without the added quotation marks, the string `Hello, World` would be parsed as two separate fields.  
Some output formats do not support ADDQUOTES.  
If you use ADDQUOTES, you must specify REMOVEQUOTES in the COPY if you reload the data.
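A sketch of the round trip, using the sample bucket and role ARN from this section: the UNLOAD adds quotation marks, and the matching COPY removes them on reload.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_quotes_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
addquotes;

copy venue
from 's3://amzn-s3-demo-bucket/venue_quotes_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
removequotes;
```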

NULL AS '*null-string*'   
Specifies a string that represents a null value in unload files. If this option is used, all output files contain the specified string in place of any null values found in the selected data. If this option isn't specified, null values are unloaded as:   
+ Zero-length strings for delimited output 
+ Whitespace strings for fixed-width output
If a null string is specified for a fixed-width unload and the width of an output column is less than the width of the null string, the following behavior occurs:   
+ An empty field is output for non-character columns 
+ An error is reported for character columns 
Unlike other data types, where a user-defined string represents a null value, Amazon Redshift exports SUPER data columns using the JSON format and represents null as determined by the JSON format. As a result, SUPER data columns ignore the NULL [AS] option used in UNLOAD commands.
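For illustration, the following sketch writes an arbitrary placeholder string (`fred`) in place of null values in the unloaded files.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_null_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
null as 'fred';
```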

ESCAPE   
For CHAR and VARCHAR columns in delimited unload files, an escape character (`\`) is placed before every occurrence of the following characters:  
+ Linefeed: `\n`
+ Carriage return: `\r`
+ The delimiter character specified for the unloaded data. 
+ The escape character: `\`
+ A quotation mark character: `"` or `'` (if both ESCAPE and ADDQUOTES are specified in the UNLOAD command).
If you loaded your data using a COPY with the ESCAPE option, you must also specify the ESCAPE option with your UNLOAD command to generate the reciprocal output file. Similarly, if you UNLOAD using the ESCAPE option, you need to use ESCAPE when you COPY the same data.
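The reciprocal pair can be sketched as follows, using the sample bucket and role ARN from this section: ESCAPE on the UNLOAD, and ESCAPE again on the COPY that reloads the same data.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_escape_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
escape;

copy venue
from 's3://amzn-s3-demo-bucket/venue_escape_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
escape;
```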

ALLOWOVERWRITE   <a name="allowoverwrite"></a>
By default, UNLOAD fails if it finds files that it would possibly overwrite. If ALLOWOVERWRITE is specified, UNLOAD overwrites existing files, including the manifest file. 

CLEANPATH  <a name="cleanpath"></a>
The CLEANPATH option removes existing files located in the Amazon S3 path specified in the TO clause before unloading files to the specified location.   
If you include the PARTITION BY clause, existing files are removed only from the partition folders to receive new files generated by the UNLOAD operation.  
You must have the `s3:DeleteObject` permission on the Amazon S3 bucket. For information, see [Policies and Permissions in Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-policy-language-overview.html) in the *Amazon Simple Storage Service User Guide*. Files that you remove by using the CLEANPATH option are permanently deleted and can't be recovered. If the target Amazon S3 bucket has versioning enabled, UNLOAD with the CLEANPATH option does not remove previous versions of the files.  
You can't specify the CLEANPATH option if you specify the ALLOWOVERWRITE option.
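A minimal sketch, assuming the sample bucket and role ARN used elsewhere in this section: existing files under the `venue_pipe_` prefix are permanently deleted before the new files are written.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_pipe_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
cleanpath;
```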

PARALLEL   <a name="unload-parallel"></a>
By default, UNLOAD writes data in parallel to multiple files, according to the number of slices in the cluster. The default option is ON or TRUE. If PARALLEL is OFF or FALSE, UNLOAD writes to one or more data files serially, sorted absolutely according to the ORDER BY clause, if one is used. The maximum size for a data file is 6.2 GB. So, for example, if you unload 13.4 GB of data, UNLOAD creates the following three files.  

```
s3://amzn-s3-demo-bucket/key000    6.2 GB
s3://amzn-s3-demo-bucket/key001    6.2 GB
s3://amzn-s3-demo-bucket/key002    1.0 GB
```
The UNLOAD command is designed to use parallel processing. We recommend leaving PARALLEL enabled for most cases, especially if the files are used to load tables using a COPY command.
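To write serially instead, turn PARALLEL off, as in the following sketch. With an ORDER BY in the query, the output is sorted absolutely across the resulting file or files.

```
unload ('select * from venue order by venueid')
to 's3://amzn-s3-demo-bucket/venue_serial_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
parallel off;
```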

MAXFILESIZE [AS] max-size [ MB | GB ]   <a name="unload-maxfilesize"></a>
Specifies the maximum size of files that UNLOAD creates in Amazon S3. Specify a decimal value between 5 MB and 6.2 GB. The AS keyword is optional. The default unit is MB. If MAXFILESIZE isn't specified, the default maximum file size is 6.2 GB. The size of the manifest file, if one is used, isn't affected by MAXFILESIZE.
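For illustration, the following sketch caps each output file at 100 MB; the size value is an example, and any decimal value between 5 MB and 6.2 GB is allowed.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
maxfilesize 100 mb;
```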

ROWGROUPSIZE [AS] size [ MB | GB ]   <a name="unload-rowgroupsize"></a>
Specifies the size of row groups. Choosing a larger size can reduce the number of row groups, reducing the amount of network communication. Specify an integer value between 32 MB and 128 MB. The AS keyword is optional. The default unit is MB.  
If ROWGROUPSIZE isn't specified, the default size is 32 MB. To use this parameter, the storage format must be Parquet and the node type must be ra3.4xlarge, ra3.16xlarge, or dc2.8xlarge.

REGION [AS] '*aws-region*'  <a name="unload-region"></a>
Specifies the AWS Region where the target Amazon S3 bucket is located. REGION is required for UNLOAD to an Amazon S3 bucket that isn't in the same AWS Region as the Amazon Redshift database.   
The value for *aws_region* must match an AWS Region listed in the [Amazon Redshift regions and endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html#redshift_region) table in the *AWS General Reference*.  
By default, UNLOAD assumes that the target Amazon S3 bucket is located in the same AWS Region as the Amazon Redshift database.

EXTENSION '*extension-name*'  <a name="unload-extension"></a>
Specifies the file extension to append to the names of the unloaded files. Amazon Redshift doesn't run any validation, so you must verify that the specified file extension is correct. If you specify a compression method without providing an extension, Amazon Redshift only adds the compression method's extension to the filename. If you don't provide any extension and don't specify a compression method, Amazon Redshift doesn't add anything to the filename. 
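As a sketch, the following statement pairs GZIP compression with an explicit extension so that the unloaded files end in `.csv.gz`. The extension value is illustrative and isn't validated by Amazon Redshift.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
csv gzip
extension 'csv.gz';
```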

## Usage notes
<a name="unload-usage-notes"></a>

### Using ESCAPE for all delimited text UNLOAD operations
<a name="unload-usage-escape"></a>

When you UNLOAD using a delimiter, your data can include that delimiter or any of the characters listed in the ESCAPE option description. In this case, you must use the ESCAPE option with the UNLOAD statement. If you don't use the ESCAPE option with the UNLOAD, subsequent COPY operations using the unloaded data might fail.

**Important**  
We strongly recommend that you always use ESCAPE with both UNLOAD and COPY statements. The exception is if you are certain that your data doesn't contain any delimiters or other characters that might need to be escaped. 

### Loss of floating-point precision
<a name="unload-usage-floating-point-precision"></a>

You might encounter loss of precision for floating-point data that is successively unloaded and reloaded. 

### Limit clause
<a name="unload-usage-limit-clause"></a>

The SELECT query can't use a LIMIT clause in the outer SELECT. For example, the following UNLOAD statement fails.

```
unload ('select * from venue limit 10')
to 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

Instead, use a nested LIMIT clause, as in the following example.

```
unload ('select * from venue where venueid in
(select venueid from venue order by venueid desc limit 10)')
to 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

You can also populate a table using SELECT…INTO or CREATE TABLE AS using a LIMIT clause, then unload from that table.
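That workaround can be sketched as follows; the table name `venue_top10` is a hypothetical name for illustration.

```
create table venue_top10 as
select * from venue order by venueid desc limit 10;

unload ('select * from venue_top10')
to 's3://amzn-s3-demo-bucket/venue_top10_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```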

### Unloading a column of the GEOMETRY data type
<a name="unload-usage-geometry"></a>

You can only unload GEOMETRY columns to text or CSV format. You can't unload GEOMETRY data with the `FIXEDWIDTH` option. The data is unloaded in the hexadecimal form of the extended well-known binary (EWKB) format. If the size of the EWKB data is more than 4 MB, then a warning occurs because the data can't later be loaded into a table. 

### Unloading the HLLSKETCH data type
<a name="unload-usage-hll"></a>

You can only unload HLLSKETCH columns to text or CSV format. You can't unload HLLSKETCH data with the `FIXEDWIDTH` option. The data is unloaded in the Base64 format for dense HyperLogLog sketches or in the JSON format for sparse HyperLogLog sketches. For more information, see [HyperLogLog functions](hyperloglog-functions.md).

The following example exports a table containing HLLSKETCH columns into a file.

```
CREATE TABLE a_table(an_int INT, b_int INT);
INSERT INTO a_table VALUES (1,1), (2,1), (3,1), (4,1), (1,2), (2,2), (3,2), (4,2), (5,2), (6,2);

CREATE TABLE hll_table (sketch HLLSKETCH);
INSERT INTO hll_table select hll_create_sketch(an_int) from a_table group by b_int;

UNLOAD ('select * from hll_table') TO 's3://amzn-s3-demo-bucket/unload/'
IAM_ROLE 'arn:aws:iam::0123456789012:role/MyRedshiftRole' NULL AS 'null' ALLOWOVERWRITE CSV;
```

### Unloading a column of the VARBYTE data type
<a name="unload-usage-varbyte"></a>

You can only unload VARBYTE columns to text or CSV format. The data is unloaded in the hexadecimal form. You can't unload VARBYTE data with the `FIXEDWIDTH` option. The `ADDQUOTES` option of UNLOAD to a CSV is not supported. A VARBYTE column can't be a PARTITIONED BY column. 

### FORMAT AS PARQUET clause
<a name="unload-parquet-usage"></a>

Be aware of these considerations when using FORMAT AS PARQUET:
+ Unload to Parquet doesn't use file level compression. Each row group is compressed with SNAPPY.
+ If MAXFILESIZE isn't specified, the default maximum file size is 6.2 GB. You can use MAXFILESIZE to specify a file size of 5 MB–6.2 GB. The actual file size is approximated when the file is being written, so it might not be exactly equal to the number you specify.

  To maximize scan performance, Amazon Redshift tries to create Parquet files that contain equally sized 32-MB row groups. The MAXFILESIZE value that you specify is automatically rounded down to the nearest multiple of 32 MB. For example, if you specify MAXFILESIZE 200 MB, then each Parquet file unloaded is approximately 192 MB (32 MB row group x 6 = 192 MB).
+ If a column uses TIMESTAMPTZ data format, only the timestamp values are unloaded. The time zone information isn't unloaded.
+ Don't specify file name prefixes that begin with underscore (_) or period (.) characters. Redshift Spectrum treats files that begin with these characters as hidden files and ignores them.

### PARTITION BY clause
<a name="unload-partitionby-usage"></a>

Be aware of these considerations when using PARTITION BY:
+ Partition columns aren't included in the output file.
+ Make sure to include partition columns in the SELECT query used in the UNLOAD statement. You can specify any number of partition columns in the UNLOAD command. However, at least one nonpartition column must be part of the file.
+ If the partition key value is null, Amazon Redshift automatically unloads that data into a default partition called `partition_column=__HIVE_DEFAULT_PARTITION__`. 
+ The UNLOAD command doesn't make any calls to an external catalog. To register your new partitions to be part of your existing external table, use a separate ALTER TABLE ... ADD PARTITION ... command. Or you can run a CREATE EXTERNAL TABLE command to register the unloaded data as a new external table. You can also use an AWS Glue crawler to populate your Data Catalog. For more information, see [Defining Crawlers](https://docs.aws.amazon.com/glue/latest/dg/add-crawler.html) in the *AWS Glue Developer Guide*. 
+ If you use the MANIFEST option, Amazon Redshift generates only one manifest file in the root Amazon S3 folder.
+ The column data types that you can use as the partition key are SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, BOOLEAN, CHAR, VARCHAR, DATE, and TIMESTAMP. 

### Using the ASSUMEROLE privilege to grant access to an IAM role for UNLOAD operations
<a name="unload-assumerole-privilege-usage"></a>

To provide access for specific users and groups to an IAM role for UNLOAD operations, a superuser can grant the ASSUMEROLE privilege on an IAM role to users and groups. For information, see [GRANT](r_GRANT.md). 

### UNLOAD doesn't support Amazon S3 access point aliases
<a name="unload-usage-s3-access-point-alias"></a>

You can't use Amazon S3 access point aliases with the UNLOAD command. 

## Examples
<a name="r_UNLOAD-examples"></a>

For examples that show how to use the UNLOAD command, see [UNLOAD examples](r_UNLOAD_command_examples.md).

# UNLOAD examples
<a name="r_UNLOAD_command_examples"></a>

These examples demonstrate various parameters of the UNLOAD command. The TICKIT sample data is used in many of the examples. For more information, see [Sample database](c_sampledb.md).

**Note**  
These examples contain line breaks for readability. Do not include line breaks or spaces in your *credentials-args* string.

## Unload VENUE to a pipe-delimited file (default delimiter)
<a name="unload-examples-venue"></a>

The following example unloads the VENUE table and writes the data to `s3://amzn-s3-demo-bucket/unload/`: 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

By default, UNLOAD writes one or more files per slice. Assuming a two-node cluster with two slices per node, the previous example creates these files in `amzn-s3-demo-bucket`:

```
unload/0000_part_00
unload/0001_part_00
unload/0002_part_00
unload/0003_part_00
```

To better differentiate the output files, you can include a prefix in the location. The following example unloads the VENUE table and writes the data to `s3://amzn-s3-demo-bucket/unload/venue_pipe_`: 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/venue_pipe_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The result is these four files in the `unload` folder, again assuming four slices.

```
venue_pipe_0000_part_00
venue_pipe_0001_part_00
venue_pipe_0002_part_00
venue_pipe_0003_part_00
```

## Unload LINEITEM table to partitioned Parquet files
<a name="unload-examples-partitioned-parquet"></a>

The following example unloads the LINEITEM table in Parquet format, partitioned by the `l_shipdate` column. 

```
unload ('select * from lineitem')
to 's3://amzn-s3-demo-bucket/lineitem/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
PARQUET
PARTITION BY (l_shipdate);
```

Assuming four slices, the resulting Parquet files are dynamically partitioned into various folders. 

```
s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-02/0000_part_00.parquet
                                             0001_part_00.parquet
                                             0002_part_00.parquet
                                             0003_part_00.parquet
s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-03/0000_part_00.parquet
                                             0001_part_00.parquet
                                             0002_part_00.parquet
                                             0003_part_00.parquet
s3://amzn-s3-demo-bucket/lineitem/l_shipdate=1992-01-04/0000_part_00.parquet
                                             0001_part_00.parquet
                                             0002_part_00.parquet
                                             0003_part_00.parquet
...
```

**Note**  
To also include the partition column values in the unloaded data files, add the INCLUDE option as shown in the following SQL statement.   

```
unload ('select * from lineitem')
to 's3://amzn-s3-demo-bucket/lineitem/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
PARQUET
PARTITION BY (l_shipdate) INCLUDE;
```
With the INCLUDE option, the `l_shipdate` column data is also in the Parquet files. Without it, the `l_shipdate` column data isn't in the Parquet files.

## Unload the VENUE table to a JSON file
<a name="unload-examples-json"></a>

The following example unloads the VENUE table and writes the data in JSON format to `s3://amzn-s3-demo-bucket/unload/`.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
JSON;
```

Following are sample rows from the VENUE table.

```
venueid | venuename                  | venuecity       | venuestate | venueseats
--------+----------------------------+-----------------+------------+-----------
      1 | Pinewood Racetrack         | Akron           | OH         | 0
      2 | Columbus "Crew" Stadium    | Columbus        | OH         | 0
      4 | Community, Ballpark, Arena | Kansas City     | KS         | 0
```

After unloading to JSON, the format of the file is similar to the following.

```
{"venueid":1,"venuename":"Pinewood Racetrack","venuecity":"Akron","venuestate":"OH","venueseats":0}
{"venueid":2,"venuename":"Columbus \"Crew\" Stadium ","venuecity":"Columbus","venuestate":"OH","venueseats":0}
{"venueid":4,"venuename":"Community, Ballpark, Arena","venuecity":"Kansas City","venuestate":"KS","venueseats":0}
```

## Unload VENUE to a CSV file
<a name="unload-examples-csv"></a>

The following example unloads the VENUE table and writes the data in CSV format to `s3://amzn-s3-demo-bucket/unload/`.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
CSV;
```

Suppose that the VENUE table contains the following rows.

```
venueid | venuename                  | venuecity       | venuestate | venueseats
--------+----------------------------+-----------------+------------+-----------
      1 | Pinewood Racetrack         | Akron           | OH         | 0
      2 | Columbus "Crew" Stadium    | Columbus        | OH         | 0
      4 | Community, Ballpark, Arena | Kansas City     | KS         | 0
```

The unload file looks similar to the following.

```
1,Pinewood Racetrack,Akron,OH,0
2,"Columbus ""Crew"" Stadium",Columbus,OH,0
4,"Community, Ballpark, Arena",Kansas City,KS,0
```
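These quoting rules — enclose a field in double quotation marks when it contains the delimiter or a quotation mark, and double any embedded quotation marks — follow standard CSV conventions. The following Python sketch reproduces the sample output above with the standard-library `csv` module; it's an illustration of the format, not Redshift's implementation.

```python
import csv
import io

rows = [
    [1, "Pinewood Racetrack", "Akron", "OH", 0],
    [2, 'Columbus "Crew" Stadium', "Columbus", "OH", 0],
    [4, "Community, Ballpark, Arena", "Kansas City", "KS", 0],
]

buf = io.StringIO()
# QUOTE_MINIMAL matches the behavior shown above: quote only fields that
# contain the delimiter or a quotation mark, doubling embedded quotation marks.
csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n").writerows(rows)
print(buf.getvalue())
```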

## Unload VENUE to a CSV file using a delimiter
<a name="unload-examples-csv-delimiter"></a>

The following example unloads the VENUE table and writes the data in CSV format using the pipe character (`|`) as the delimiter. The unloaded file is written to `s3://amzn-s3-demo-bucket/unload/`. The VENUE table in this example contains the pipe character in the value of the first row (`Pinewood Race|track`). This demonstrates that a value containing the delimiter is enclosed in double quotation marks, and any double quotation mark within a field is escaped by doubling it. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
CSV DELIMITER AS '|';
```

Suppose that the VENUE table contains the following rows.

```
venueid | venuename                  | venuecity       | venuestate | venueseats
--------+----------------------------+-----------------+------------+-------------
      1 | Pinewood Race|track        | Akron           | OH         | 0
      2 | Columbus "Crew" Stadium    | Columbus        | OH         | 0
      4 | Community, Ballpark, Arena | Kansas City     | KS         | 0
```

The unload file looks similar to the following.

```
1|"Pinewood Race|track"|Akron|OH|0
2|"Columbus ""Crew"" Stadium"|Columbus|OH|0
4|Community, Ballpark, Arena|Kansas City|KS|0
```
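Because the embedded pipe is protected by the surrounding quotation marks, a standards-compliant CSV reader configured with the pipe delimiter recovers the original values. A minimal Python sketch using the sample output above:

```python
import csv
import io

# The unload output shown above, with the pipe character as the delimiter.
data = '''1|"Pinewood Race|track"|Akron|OH|0
2|"Columbus ""Crew"" Stadium"|Columbus|OH|0
4|Community, Ballpark, Arena|Kansas City|KS|0'''

rows = list(csv.reader(io.StringIO(data), delimiter="|"))
print(rows[0])  # ['1', 'Pinewood Race|track', 'Akron', 'OH', '0']
```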

## Unload VENUE with a manifest file
<a name="unload-examples-manifest"></a>

To create a manifest file, include the MANIFEST option. The following example unloads the VENUE table and writes a manifest file along with the data files to `s3://amzn-s3-demo-bucket/venue_pipe_`. 

**Important**  
If you unload files with the MANIFEST option, you should use the MANIFEST option with the COPY command when you load the files. If you use the same prefix to load the files and don't specify the MANIFEST option, COPY fails because it assumes the manifest file is a data file.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest;
```

The result is these five files:

```
s3://amzn-s3-demo-bucket/venue_pipe_0000_part_00
s3://amzn-s3-demo-bucket/venue_pipe_0001_part_00
s3://amzn-s3-demo-bucket/venue_pipe_0002_part_00
s3://amzn-s3-demo-bucket/venue_pipe_0003_part_00
s3://amzn-s3-demo-bucket/venue_pipe_manifest
```

The following shows the contents of the manifest file. 

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0000_part_00"},
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0001_part_00"},
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0002_part_00"},
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0003_part_00"}
  ]
}
```
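The manifest is plain JSON, so tools other than COPY can also consume it — for example, to enumerate the unload files before processing them. A minimal Python sketch using two of the entries above:

```python
import json

# A shortened copy of the manifest shown above.
manifest = json.loads('''{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0000_part_00"},
    {"url":"s3://amzn-s3-demo-bucket/tickit/venue_0001_part_00"}
  ]
}''')

urls = [entry["url"] for entry in manifest["entries"]]
print(len(urls))  # 2
```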

## Unload VENUE with MANIFEST VERBOSE
<a name="unload-examples-manifest-verbose"></a>

When you specify the MANIFEST VERBOSE option, the manifest file includes the following sections: 
+ The `entries` section lists Amazon S3 path, file size, and row count for each file. 
+ The `schema` section lists the column names, data types, and dimension for each column. 
+ The `meta` section shows the total file size and row count for all files. 

The following example unloads the VENUE table using the MANIFEST VERBOSE option. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload_venue_folder/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest verbose;
```

The following shows the contents of the manifest file.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0000_part_00", "meta": { "content_length": 32295, "record_count": 10 }},
    {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0001_part_00", "meta": { "content_length": 32771, "record_count": 20 }},
    {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0002_part_00", "meta": { "content_length": 32302, "record_count": 10 }},
    {"url":"s3://amzn-s3-demo-bucket/venue_pipe_0003_part_00", "meta": { "content_length": 31810, "record_count": 15 }}
  ],
  "schema": {
    "elements": [
      {"name": "venueid", "type": { "base": "integer" }},
      {"name": "venuename", "type": { "base": "character varying", 25 }},
      {"name": "venuecity", "type": { "base": "character varying", 25 }},
      {"name": "venuestate", "type": { "base": "character varying", 25 }},
      {"name": "venueseats", "type": { "base": "character varying", 25 }}
    ]
  },
  "meta": {
    "content_length": 129178,
    "record_count": 55
  },
  "author": {
    "name": "Amazon Redshift",
    "version": "1.0.0"
  }
}
```
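Because the top-level `meta` section totals the per-file values, you can verify a verbose manifest by summing over `entries`. A Python sketch using the sample sizes and row counts above (file URLs shortened for brevity):

```python
import json

# Entries and totals copied from the verbose manifest above, with short URLs.
manifest = json.loads('''{
  "entries": [
    {"url":"f0", "meta": { "content_length": 32295, "record_count": 10 }},
    {"url":"f1", "meta": { "content_length": 32771, "record_count": 20 }},
    {"url":"f2", "meta": { "content_length": 32302, "record_count": 10 }},
    {"url":"f3", "meta": { "content_length": 31810, "record_count": 15 }}
  ],
  "meta": { "content_length": 129178, "record_count": 55 }
}''')

# The meta totals should equal the sums over the individual entries.
totals = manifest["meta"]
assert sum(e["meta"]["content_length"] for e in manifest["entries"]) == totals["content_length"]
assert sum(e["meta"]["record_count"] for e in manifest["entries"]) == totals["record_count"]
```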

## Unload VENUE with a header
<a name="unload-examples-header"></a>

The following example unloads VENUE with a header row.

```
unload ('select * from venue where venueseats > 75000')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
header
parallel off;
```

The following shows the contents of the output file with a header row.

```
venueid|venuename|venuecity|venuestate|venueseats
6|New York Giants Stadium|East Rutherford|NJ|80242
78|INVESCO Field|Denver|CO|76125
83|FedExField|Landover|MD|91704
79|Arrowhead Stadium|Kansas City|MO|79451
```

## Unload VENUE to smaller files
<a name="unload-examples-maxfilesize"></a>

By default, the maximum file size is 6.2 GB. If the unload data is larger than 6.2 GB, UNLOAD creates a new file for each 6.2 GB data segment. To create smaller files, include the MAXFILESIZE parameter. Assuming the size of the data in the previous example was 20 GB, the following UNLOAD command creates 20 files, each 1 GB in size.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
maxfilesize 1 gb;
```
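As a rough sanity check, you can estimate the resulting file count as the ceiling of the total data size divided by the file size limit. This ignores per-slice splitting under PARALLEL ON, so treat it as an approximation rather than a guarantee.

```python
import math

data_gb = 20  # total unload size assumed in the example above

# Estimated file counts at the default limit and with MAXFILESIZE 1 GB.
print(math.ceil(data_gb / 6.2))  # 4
print(math.ceil(data_gb / 1))    # 20
```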

## Unload VENUE serially
<a name="unload-examples-serial"></a>

To unload serially, specify PARALLEL OFF. UNLOAD then writes one file at a time, up to a maximum of 6.2 GB per file. 

The following example unloads the VENUE table and writes the data serially to `s3://amzn-s3-demo-bucket/unload/`. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/venue_serial_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
parallel off;
```

The result is one file named `venue_serial_000`. 

If the unload data is larger than 6.2 GB, UNLOAD creates a new file for each 6.2 GB data segment. The following example unloads the LINEORDER table and writes the data serially to `s3://amzn-s3-demo-bucket/unload/`. 

```
unload ('select * from lineorder')
to 's3://amzn-s3-demo-bucket/unload/lineorder_serial_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
parallel off gzip;
```

The result is the following series of files.

```
lineorder_serial_0000.gz
lineorder_serial_0001.gz
lineorder_serial_0002.gz
lineorder_serial_0003.gz
```

To better differentiate the output files, you can include a prefix in the location. The following example unloads the VENUE table and writes the data to `s3://amzn-s3-demo-bucket/venue_pipe_`: 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/unload/venue_pipe_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The result is these four files in the `unload` folder, again assuming four slices.

```
venue_pipe_0000_part_00
venue_pipe_0001_part_00
venue_pipe_0002_part_00
venue_pipe_0003_part_00
```

## Load VENUE from unload files
<a name="unload-examples-load"></a>

To load a table from a set of unload files, simply reverse the process by using a COPY command. The following example creates a new table, LOADVENUE, and loads the table from the data files created in the previous example.

```
create table loadvenue (like venue);

copy loadvenue from 's3://amzn-s3-demo-bucket/venue_pipe_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

If you used the MANIFEST option to create a manifest file with your unload files, you can load the data using the same manifest file. You do so with a COPY command with the MANIFEST option. The following example loads data using a manifest file.

```
copy loadvenue
from 's3://amzn-s3-demo-bucket/venue_pipe_manifest' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest;
```

## Unload VENUE to encrypted files
<a name="unload-examples-unload-encrypted"></a>

The following example unloads the VENUE table to a set of encrypted files using an AWS KMS key. If you specify a manifest file with the ENCRYPTED option, the manifest file is also encrypted. For more information, see [Unloading encrypted data files](t_unloading_encrypted_files.md).

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_encrypt_kms'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
kms_key_id '1234abcd-12ab-34cd-56ef-1234567890ab'
manifest
encrypted;
```

The following example unloads the VENUE table to a set of encrypted files using a root symmetric key. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_encrypt_cmk'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
master_symmetric_key 'EXAMPLEMASTERKEYtkbjk/OpCwtYSx/M4/t7DMCDIK722'
encrypted;
```

## Load VENUE from encrypted files
<a name="unload-examples-load-encrypted"></a>

To load tables from a set of files that were created by using UNLOAD with the ENCRYPT option, reverse the process by using a COPY command. With that command, use the ENCRYPTED option and specify the same root symmetric key that was used for the UNLOAD command. The following example loads the LOADVENUE table from the encrypted data files created in the previous example.

```
create table loadvenue (like venue);

copy loadvenue
from 's3://amzn-s3-demo-bucket/venue_encrypt_manifest'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
master_symmetric_key 'EXAMPLEMASTERKEYtkbjk/OpCwtYSx/M4/t7DMCDIK722'
manifest
encrypted;
```

## Unload VENUE data to a tab-delimited file
<a name="unload-examples-venue-tab"></a>

```
unload ('select venueid, venuename, venueseats from venue')
to 's3://amzn-s3-demo-bucket/venue_tab_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter as '\t';
```

The output data files look like this: 

```
1	Toyota Park	Bridgeview	IL	0
2	Columbus Crew Stadium	Columbus	OH	0
3	RFK Stadium	Washington	DC	0
4	CommunityAmerica Ballpark	Kansas City	KS	0
5	Gillette Stadium	Foxborough	MA	68756
...
```

## Unload VENUE to a fixed-width data file
<a name="unload-venue-fixed-width"></a>

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_fw_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
fixedwidth as 'venueid:3,venuename:39,venuecity:16,venuestate:2,venueseats:6';
```

The output data files look like the following. 

```
1  Toyota Park              Bridgeview  IL0
2  Columbus Crew Stadium    Columbus    OH0
3  RFK Stadium              Washington  DC0
4  CommunityAmerica BallparkKansas City KS0
5  Gillette Stadium         Foxborough  MA68756
...
```
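The FIXEDWIDTH specification maps each column to a byte width, with values left-justified and space-padded to that width. The following Python sketch builds and parses a record from the spec string used above; the padding assumption reflects typical fixed-width output and is for illustration only.

```python
# The fixedwidth specification from the UNLOAD command above.
spec = 'venueid:3,venuename:39,venuecity:16,venuestate:2,venueseats:6'
widths = [(name, int(w)) for name, w in (field.split(':') for field in spec.split(','))]

# Build a record padded to the declared widths (left-justified, space-padded).
values = {'venueid': '1', 'venuename': 'Toyota Park', 'venuecity': 'Bridgeview',
          'venuestate': 'IL', 'venueseats': '0'}
line = ''.join(values[name].ljust(w) for name, w in widths)

def parse(record):
    """Slice a fixed-width record back into named fields."""
    out, pos = {}, 0
    for name, w in widths:
        out[name] = record[pos:pos + w].rstrip()
        pos += w
    return out

assert parse(line) == values
```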

## Unload VENUE to a set of tab-delimited GZIP-compressed files
<a name="unload-examples-venue-gzip"></a>

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_tab_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter as '\t'
gzip;
```

## Unload VENUE to a GZIP-compressed text file
<a name="unload-examples-venue-extension-gzip"></a>

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_tab_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
extension 'txt.gz'
gzip;
```

## Unload data that contains a delimiter
<a name="unload-examples-delimiter"></a>

This example uses the ADDQUOTES option to unload comma-delimited data where some of the actual data fields contain a comma.

First, create and populate a table whose data contains commas.

```
create table location (id int, location char(64));

insert into location values (1,'Phoenix, AZ'),(2,'San Diego, CA'),(3,'Chicago, IL');
```

Then, unload the data using the ADDQUOTES option.

```
unload ('select id, location from location')
to 's3://amzn-s3-demo-bucket/location_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
delimiter ',' addquotes;
```

The unloaded data files look like this: 

```
1,"Phoenix, AZ"
2,"San Diego, CA"
3,"Chicago, IL"
...
```

## Unload the results of a join query
<a name="unload-examples-join"></a>

The following example unloads the results of a join query that contains a window function. 

```
unload ('select venuecity, venuestate, caldate, pricepaid,
sum(pricepaid) over(partition by venuecity, venuestate
order by caldate rows between 3 preceding and 3 following) as winsum
from sales join date on sales.dateid=date.dateid
join event on event.eventid=sales.eventid
join venue on event.venueid=venue.venueid
order by 1,2')
to 's3://amzn-s3-demo-bucket/tickit/winsum'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

The output files look like this: 

```
Atlanta|GA|2008-01-04|363.00|1362.00
Atlanta|GA|2008-01-05|233.00|2030.00
Atlanta|GA|2008-01-06|310.00|3135.00
Atlanta|GA|2008-01-08|166.00|8338.00
Atlanta|GA|2008-01-11|268.00|7630.00
...
```

## Unload using NULL AS
<a name="unload-examples-null-as"></a>

UNLOAD outputs null values as empty strings by default. The following examples show how to use NULL AS to substitute a text string for nulls.

For these examples, we add some null values to the VENUE table.

```
update venue set venuestate = NULL
where venuecity = 'Cleveland';
```

Select from VENUE where VENUESTATE is null to verify that the columns contain NULL.

```
select * from venue where venuestate is null;

 venueid |        venuename         | venuecity | venuestate | venueseats
---------+--------------------------+-----------+------------+------------
      22 | Quicken Loans Arena      | Cleveland |            |          0
     101 | Progressive Field        | Cleveland |            |      43345
      72 | Cleveland Browns Stadium | Cleveland |            |      73200
```

Now, UNLOAD the VENUE table using the NULL AS option to replace null values with the character string '`fred`'. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
null as 'fred';
```

The following sample from the unload file shows that null values were replaced with `fred`. It turns out that some values for VENUESEATS were also null and were replaced with `fred`. Even though the data type for VENUESEATS is integer, UNLOAD converts the values to text in the unload files, and then COPY converts them back to integer. If you are unloading to a fixed-width file, the NULL AS string must not be larger than the field width.

```
248|Charles Playhouse|Boston|MA|0
251|Paris Hotel|Las Vegas|NV|fred
258|Tropicana Hotel|Las Vegas|NV|fred
300|Kennedy Center Opera House|Washington|DC|0
306|Lyric Opera House|Baltimore|MD|0
308|Metropolitan Opera|New York City|NY|0
  5|Gillette Stadium|Foxborough|MA|5
 22|Quicken Loans Arena|Cleveland|fred|0
101|Progressive Field|Cleveland|fred|43345
...
```
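If you read the unloaded file with a tool other than COPY, you must map the NULL AS sentinel back to a null yourself. A minimal Python sketch, assuming the pipe-delimited output and the `fred` sentinel from the example above:

```python
# The sentinel string used with NULL AS in the UNLOAD example above.
SENTINEL = 'fred'

def parse_line(line):
    """Split a pipe-delimited record, mapping the sentinel back to None."""
    return [None if field == SENTINEL else field for field in line.split('|')]

print(parse_line('251|Paris Hotel|Las Vegas|NV|fred'))
# ['251', 'Paris Hotel', 'Las Vegas', 'NV', None]
```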

To load a table from the unload files, use a COPY command with the same NULL AS option. 

**Note**  
If you attempt to load nulls into a column defined as NOT NULL, the COPY command fails.

```
create table loadvenuenulls (like venue);

copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
null as 'fred';
```

To verify that the columns contain null, not just empty strings, select from LOADVENUENULLS and filter for null.

```
select * from loadvenuenulls where venuestate is null or venueseats is null;

 venueid |        venuename         | venuecity | venuestate | venueseats
---------+--------------------------+-----------+------------+------------
      72 | Cleveland Browns Stadium | Cleveland |            |      73200
     253 | Mirage Hotel             | Las Vegas | NV         |
     255 | Venetian Hotel           | Las Vegas | NV         |
      22 | Quicken Loans Arena      | Cleveland |            |          0
     101 | Progressive Field        | Cleveland |            |      43345
     251 | Paris Hotel              | Las Vegas | NV         |

...
```

You can UNLOAD a table that contains nulls using the default NULL AS behavior and then COPY the data back into a table using the default NULL AS behavior; however, any non-numeric fields in the target table contain empty strings, not nulls. By default, UNLOAD converts nulls to empty strings (whitespace or zero-length). COPY converts empty strings to NULL for numeric columns, but inserts empty strings into non-numeric columns. The following example shows how to perform an UNLOAD followed by a COPY using the default NULL AS behavior. 

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' allowoverwrite;

truncate loadvenuenulls;
copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
```

In this case, when you filter for nulls, only the rows where VENUESEATS contained nulls are returned. Where VENUESTATE contained nulls in the source table (VENUE), VENUESTATE in the target table (LOADVENUENULLS) contains empty strings.

```
select * from loadvenuenulls where venuestate is null or venueseats is null;

 venueid |        venuename         | venuecity | venuestate | venueseats
---------+--------------------------+-----------+------------+------------
     253 | Mirage Hotel             | Las Vegas | NV         |
     255 | Venetian Hotel           | Las Vegas | NV         |
     251 | Paris Hotel              | Las Vegas | NV         |
...
```

To load empty strings to non-numeric columns as NULL, include the EMPTYASNULL or BLANKSASNULL option. You can use both options together.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' allowoverwrite;

truncate loadvenuenulls;
copy loadvenuenulls from 's3://amzn-s3-demo-bucket/nulls/'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' EMPTYASNULL;
```

To verify that the columns contain NULL, not just whitespace or empty strings, select from LOADVENUENULLS and filter for null.

```
select * from loadvenuenulls where venuestate is null or venueseats is null;

 venueid |        venuename         | venuecity | venuestate | venueseats
---------+--------------------------+-----------+------------+------------
      72 | Cleveland Browns Stadium | Cleveland |            |      73200
     253 | Mirage Hotel             | Las Vegas | NV         |
     255 | Venetian Hotel           | Las Vegas | NV         |
      22 | Quicken Loans Arena      | Cleveland |            |          0
     101 | Progressive Field        | Cleveland |            |      43345
     251 | Paris Hotel              | Las Vegas | NV         |
     ...
```

## Unload using ALLOWOVERWRITE parameter
<a name="unload-examples-allowoverwrite"></a>

By default, UNLOAD doesn't overwrite existing files in the destination bucket. For example, if you run the same UNLOAD statement twice without modifying the files in the destination bucket, the second UNLOAD fails. To overwrite the existing files, including the manifest file, specify the ALLOWOVERWRITE option.

```
unload ('select * from venue')
to 's3://amzn-s3-demo-bucket/venue_pipe_'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole'
manifest allowoverwrite;
```

## Unload EVENT table using PARALLEL and MANIFEST parameters
<a name="unload-examples-manifest-parallel"></a>

You can UNLOAD a table in parallel and generate a manifest file. The Amazon S3 data files are all created at the same level and names are suffixed with the pattern `0000_part_00`. The manifest file is at the same folder level as the data files and suffixed with the text `manifest`. The following SQL unloads the EVENT table and creates files with the base name `parallel`.

```
unload ('select * from mytickit1.event')
to 's3://amzn-s3-demo-bucket/parallel'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
parallel on
manifest;
```

The Amazon S3 files listing is similar to the following.

```
 Name                       Last modified                        Size                  
 parallel0000_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 52.1 KB  
 parallel0001_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 53.4 KB
 parallel0002_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 52.1 KB
 parallel0003_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 51.1 KB
 parallel0004_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 54.6 KB
 parallel0005_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 53.4 KB
 parallel0006_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 54.1 KB
 parallel0007_part_00	-   August 2, 2023, 14:54:39 (UTC-07:00) 55.9 KB
 parallelmanifest       -   August 2, 2023, 14:54:39 (UTC-07:00) 886.0 B
```

The `parallelmanifest` file content is similar to the following.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/parallel0000_part_00", "meta": { "content_length": 53316 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0001_part_00", "meta": { "content_length": 54704 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0002_part_00", "meta": { "content_length": 53326 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0003_part_00", "meta": { "content_length": 52356 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0004_part_00", "meta": { "content_length": 55933 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0005_part_00", "meta": { "content_length": 54648 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0006_part_00", "meta": { "content_length": 55436 }},
    {"url":"s3://amzn-s3-demo-bucket/parallel0007_part_00", "meta": { "content_length": 57272 }}
  ]
}
```

## Unload EVENT table using PARALLEL OFF and MANIFEST parameters
<a name="unload-examples-manifest-serial"></a>

You can UNLOAD a table serially (PARALLEL OFF) and generate a manifest file. The Amazon S3 data files are all created at the same level and names are suffixed with the pattern `0000`. The manifest file is at the same folder level as the data files and suffixed with the text `manifest`.

```
unload ('select * from mytickit1.event')
to 's3://amzn-s3-demo-bucket/serial'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
parallel off
manifest;
```

The Amazon S3 files listing is similar to the following.

```
 Name                       Last modified                        Size                  
 serial0000             -   August 2, 2023, 15:54:39 (UTC-07:00) 426.7 KB  
 serialmanifest         -   August 2, 2023, 15:54:39 (UTC-07:00) 120.0 B
```

The `serialmanifest` file content is similar to the following.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/serial000", "meta": { "content_length": 436991 }}
  ]
}
```

## Unload EVENT table using PARTITION BY and MANIFEST parameters
<a name="unload-examples-manifest-partition"></a>

You can UNLOAD a table by partition and generate a manifest file. A new folder is created in Amazon S3 with child partition folders, and the data files in the child folders have a name pattern similar to `0000_part_00`. The manifest file is at the same folder level as the child folders with the name `manifest`.

```
unload ('select * from mytickit1.event')
to 's3://amzn-s3-demo-bucket/partition'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
partition by (eventname)
manifest;
```

The Amazon S3 files listing is similar to the following.

```
 Name                   Type     Last modified                        Size                  
 partition           	Folder
```

The `partition` folder contains child folders named for each partition, plus the manifest file. The following shows the bottom of the list of folders in the `partition` folder.

```
 Name                   Type      Last modified                        Size                  
 ...
 eventname=Zucchero/    Folder 
 eventname=Zumanity/    Folder 
 eventname=ZZ Top/      Folder  
 manifest          	    -	    August 2, 2023, 15:54:39 (UTC-07:00) 467.6 KB
```

In the `eventname=Zucchero/` folder are the data files similar to the following.

```
 Name               Last modified                        Size                  
 0000_part_00	-   August 2, 2023, 15:59:19 (UTC-07:00) 70.0 B
 0001_part_00	-   August 2, 2023, 15:59:16 (UTC-07:00) 106.0 B
 0002_part_00	-   August 2, 2023, 15:59:15 (UTC-07:00) 70.0 B
 0004_part_00	-   August 2, 2023, 15:59:17 (UTC-07:00) 141.0 B
 0006_part_00	-   August 2, 2023, 15:59:16 (UTC-07:00) 35.0 B
 0007_part_00	-   August 2, 2023, 15:59:19 (UTC-07:00) 108.0 B
```

The bottom of the `manifest` file content is similar to the following.

```
{
  "entries": [
    ...
    {"url":"s3://amzn-s3-demo-bucket/partition/eventname=Zucchero/007_part_00", "meta": { "content_length": 108 }},
    {"url":"s3://amzn-s3-demo-bucket/partition/eventname=Zumanity/007_part_00", "meta": { "content_length": 72 }}
  ]
}
```

## Unload EVENT table using MAXFILESIZE, ROWGROUPSIZE, and MANIFEST parameters
<a name="unload-examples-manifest-maxsize"></a>

You can UNLOAD a table in parallel and generate a manifest file. The Amazon S3 data files are all created at the same level and names are suffixed with the pattern `0000_part_00`. The generated Parquet data files are limited to 256 MB, with a row group size of 128 MB. The manifest file is at the same folder level as the data files and suffixed with `manifest`.

```
unload ('select * from mytickit1.event')
to 's3://amzn-s3-demo-bucket/eventsize'
iam_role 'arn:aws:iam::123456789012:role/MyRedshiftRole'
maxfilesize 256 MB
rowgroupsize 128 MB
parallel on
parquet
manifest;
```

The Amazon S3 files listing is similar to the following.

```
 Name                            Type      Last modified                        Size 
 eventsize0000_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 24.5 KB
 eventsize0001_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 24.8 KB
 eventsize0002_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 24.4 KB
 eventsize0003_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 24.0 KB
 eventsize0004_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 25.3 KB
 eventsize0005_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 24.8 KB
 eventsize0006_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 25.0 KB
 eventsize0007_part_00.parquet	parquet	August 2, 2023, 17:35:21 (UTC-07:00) 25.6 KB
 eventsizemanifest                 -       August 2, 2023, 17:35:21 (UTC-07:00) 958.0 B
```

The `eventsizemanifest` file content is similar to the following.

```
{
  "entries": [
    {"url":"s3://amzn-s3-demo-bucket/eventsize0000_part_00.parquet", "meta": { "content_length": 25130 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0001_part_00.parquet", "meta": { "content_length": 25428 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0002_part_00.parquet", "meta": { "content_length": 25025 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0003_part_00.parquet", "meta": { "content_length": 24554 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0004_part_00.parquet", "meta": { "content_length": 25918 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0005_part_00.parquet", "meta": { "content_length": 25362 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0006_part_00.parquet", "meta": { "content_length": 25647 }},
    {"url":"s3://amzn-s3-demo-bucket/eventsize0007_part_00.parquet", "meta": { "content_length": 26256 }}
  ]
}
```

# UPDATE
<a name="r_UPDATE"></a>

**Topics**
+ [Syntax](#r_UPDATE-synopsis)
+ [Parameters](#r_UPDATE-parameters)
+ [Usage notes](#r_UPDATE_usage_notes)
+ [Examples of UPDATE statements](c_Examples_of_UPDATE_statements.md)

Updates values in one or more table columns when a condition is satisfied. 

**Note**  
The maximum size for a single SQL statement is 16 MB.

## Syntax
<a name="r_UPDATE-synopsis"></a>

```
[ WITH [RECURSIVE] common_table_expression [, common_table_expression , ...] ]
UPDATE table_name [ [ AS ] alias ]
SET column = { expression | DEFAULT } [,...]
[ FROM fromlist ]
[ WHERE condition ]
```

## Parameters
<a name="r_UPDATE-parameters"></a>

WITH clause  
Optional clause that specifies one or more *common-table-expressions*. See [WITH clause](r_WITH_clause.md). 

 *table\_name*   
A temporary or persistent table. Only the owner of the table or a user with UPDATE privilege on the table may update rows. If you use the FROM clause or select from tables in an expression or condition, you must have SELECT privilege on those tables. You can't give the table an alias here; however, you can specify an alias in the FROM clause.   
Amazon Redshift Spectrum external tables are read-only. You can't UPDATE an external table.

alias  
Temporary alternative name for a target table. Aliases are optional. The AS keyword is always optional. 

SET *column* =   
One or more columns that you want to modify. Columns that aren't listed retain their current values. Do not include the table name in the specification of a target column. For example, `UPDATE tab SET tab.col = 1` is invalid.

 *expression*   
An expression that defines the new value for the specified column. 

DEFAULT   
Updates the column with the default value that was assigned to the column in the CREATE TABLE statement. 

FROM *tablelist*   
You can update a table by referencing information in other tables. List these other tables in the FROM clause or use a subquery as part of the WHERE condition. Tables listed in the FROM clause can have aliases. If you need to include the target table of the UPDATE statement in the list, use an alias. 

WHERE *condition*   
Optional clause that restricts updates to rows that match a condition. When the condition returns `true`, the specified SET columns are updated. The condition can be a simple predicate on a column or a condition based on the result of a subquery.   
You can name any table in the subquery, including the target table for the UPDATE. 

## Usage notes
<a name="r_UPDATE_usage_notes"></a>

After updating a large number of rows in a table: 
+ Vacuum the table to reclaim storage space and re-sort rows. 
+ Analyze the table to update statistics for the query planner. 

Left, right, and full outer joins aren't supported in the FROM clause of an UPDATE statement; they return the following error: 

```
ERROR: Target table must be part of an equijoin predicate
```

 If you need to specify an outer join, use a subquery in the WHERE clause of the UPDATE statement. 

If your UPDATE statement requires a self-join to the target table, you need to specify the join condition, as well as the WHERE clause criteria that qualify rows for the update operation. In general, when the target table is joined to itself or another table, a best practice is to use a subquery that clearly separates the join conditions from the criteria that qualify rows for updates. 

UPDATE queries with multiple matches per row throw an error when the configuration parameter `error_on_nondeterministic_update` is set to *true*. For more information, see [error\_on\_nondeterministic\_update](r_error_on_nondeterministic_update.md).

You can update a column defined as GENERATED BY DEFAULT AS IDENTITY with values that you supply. For more information, see [GENERATED BY DEFAULT AS IDENTITY](r_CREATE_TABLE_NEW.md#identity-generated-bydefault-clause). 

# Examples of UPDATE statements
<a name="c_Examples_of_UPDATE_statements"></a>

For more information about the tables used in the following examples, see [Sample database](c_sampledb.md).

The CATEGORY table in the TICKIT database contains the following rows: 

```
+-------+----------+-----------+--------------------------------------------+
| catid | catgroup |  catname  |                  catdesc                   |
+-------+----------+-----------+--------------------------------------------+
| 5     | Sports   | MLS       | Major League Soccer                        |
| 11    | Concerts | Classical | All symphony, concerto, and choir concerts |
| 1     | Sports   | MLB       | Major League Baseball                      |
| 6     | Shows    | Musicals  | Musical theatre                            |
| 3     | Sports   | NFL       | National Football League                   |
| 8     | Shows    | Opera     | All opera and light opera                  |
| 2     | Sports   | NHL       | National Hockey League                     |
| 9     | Concerts | Pop       | All rock and pop music concerts            |
| 4     | Sports   | NBA       | National Basketball Association            |
| 7     | Shows    | Plays     | All non-musical theatre                    |
| 10    | Concerts | Jazz      | All jazz singers and bands                 |
+-------+----------+-----------+--------------------------------------------+
```

 **Updating a table based on a range of values** 

Update the CATGROUP column based on a range of values in the CATID column. 

```
UPDATE category
SET catgroup='Theatre'
WHERE catid BETWEEN 6 AND 8;

SELECT * FROM category
WHERE catid BETWEEN 6 AND 8;

+-------+----------+----------+---------------------------+
| catid | catgroup | catname  |          catdesc          |
+-------+----------+----------+---------------------------+
| 6     | Theatre  | Musicals | Musical theatre           |
| 7     | Theatre  | Plays    | All non-musical theatre   |
| 8     | Theatre  | Opera    | All opera and light opera |
+-------+----------+----------+---------------------------+
```

 **Updating a table based on a current value** 

Update the CATNAME and CATDESC columns based on their current CATGROUP value: 

```
UPDATE category
SET catdesc=default, catname='Shows'
WHERE catgroup='Theatre';

SELECT * FROM category
WHERE catname='Shows';

+-------+----------+---------+---------+
| catid | catgroup | catname | catdesc |
+-------+----------+---------+---------+
| 6     | Theatre  | Shows   | NULL    |
| 7     | Theatre  | Shows   | NULL    |
| 8     | Theatre  | Shows   | NULL    |
+-------+----------+---------+---------+
```

In this case, the CATDESC column was set to null because no default value was defined when the table was created.

Run the following commands to set the CATEGORY table data back to the original values:

```
TRUNCATE category;

COPY category
FROM 's3://redshift-downloads/tickit/category_pipe.txt' 
DELIMITER '|' 
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;
```

 **Updating a table based on the result of a WHERE clause subquery** 

Update the CATEGORY table based on the result of a subquery in the WHERE clause: 

```
UPDATE category
SET catdesc='Broadway Musical'
WHERE category.catid IN
(SELECT category.catid FROM category
JOIN event ON category.catid = event.catid
JOIN venue ON venue.venueid = event.venueid
JOIN sales ON sales.eventid = event.eventid
WHERE venuecity='New York City' AND catname='Musicals');
```

View the updated table: 

```
SELECT * FROM category ORDER BY catid;

+-------+----------+-----------+--------------------------------------------+
| catid | catgroup |  catname  |                  catdesc                   |
+-------+----------+-----------+--------------------------------------------+
| 2     | Sports   | NHL       | National Hockey League                     |
| 3     | Sports   | NFL       | National Football League                   |
| 4     | Sports   | NBA       | National Basketball Association            |
| 5     | Sports   | MLS       | Major League Soccer                        |
| 6     | Shows    | Musicals  | Broadway Musical                           |
| 7     | Shows    | Plays     | All non-musical theatre                    |
| 8     | Shows    | Opera     | All opera and light opera                  |
| 9     | Concerts | Pop       | All rock and pop music concerts            |
| 10    | Concerts | Jazz      | All jazz singers and bands                 |
| 11    | Concerts | Classical | All symphony, concerto, and choir concerts |
+-------+----------+-----------+--------------------------------------------+
```

 **Updating a table based on the result of a WITH clause subquery** 

To update the CATEGORY table based on the result of a subquery using the WITH clause, use the following example.

```
WITH u1 as (SELECT catid FROM event ORDER BY catid DESC LIMIT 1) 
UPDATE category SET catid='200' FROM u1 WHERE u1.catid=category.catid;

SELECT * FROM category ORDER BY catid DESC LIMIT 1;

+-------+----------+---------+---------------------------------+
| catid | catgroup | catname |             catdesc             |
+-------+----------+---------+---------------------------------+
| 200   | Concerts | Pop     | All rock and pop music concerts |
+-------+----------+---------+---------------------------------+
```

## Updating a table based on the result of a join condition
<a name="c_Examples_of_UPDATE_statements-updating-a-table-based-on-the-result-of-a-join-condition"></a>

Update the original 11 rows in the CATEGORY table based on matching CATID rows in the EVENT table: 

```
UPDATE category SET catid=100
FROM event
WHERE event.catid=category.catid;

SELECT * FROM category ORDER BY catid;

+-------+----------+-----------+--------------------------------------------+
| catid | catgroup |  catname  |                  catdesc                   |
+-------+----------+-----------+--------------------------------------------+
| 2     | Sports   | NHL       | National Hockey League                     |
| 3     | Sports   | NFL       | National Football League                   |
| 4     | Sports   | NBA       | National Basketball Association            |
| 5     | Sports   | MLS       | Major League Soccer                        |
| 10    | Concerts | Jazz      | All jazz singers and bands                 |
| 11    | Concerts | Classical | All symphony, concerto, and choir concerts |
| 100   | Concerts | Pop       | All rock and pop music concerts            |
| 100   | Shows    | Plays     | All non-musical theatre                    |
| 100   | Shows    | Opera     | All opera and light opera                  |
| 100   | Shows    | Musicals  | Broadway Musical                           |
+-------+----------+-----------+--------------------------------------------+
```

 Note that the EVENT table is listed in the FROM clause and the join condition to the target table is defined in the WHERE clause. Only four rows qualified for the update. These four rows are the rows whose CATID values were originally 6, 7, 8, and 9; only those four categories are represented in the EVENT table: 

```
SELECT DISTINCT catid FROM event;

+-------+
| catid |
+-------+
| 6     |
| 7     |
| 8     |
| 9     |
+-------+
```

Update the original 11 rows in the CATEGORY table by extending the previous example and adding another condition to the WHERE clause. Because of the restriction on the CATGROUP column, only one row qualifies for the update (although four rows qualify for the join). 

```
UPDATE category SET catid=100
FROM event
WHERE event.catid=category.catid
AND catgroup='Concerts';

SELECT * FROM category WHERE catid=100;

+-------+----------+---------+---------------------------------+
| catid | catgroup | catname |             catdesc             |
+-------+----------+---------+---------------------------------+
| 100   | Concerts | Pop     | All rock and pop music concerts |
+-------+----------+---------+---------------------------------+
```

An alternative way to write this example is as follows: 

```
UPDATE category SET catid=100
FROM event JOIN category cat ON event.catid=cat.catid
WHERE cat.catgroup='Concerts';
```

The advantage to this approach is that the join criteria are clearly separated from any other criteria that qualify rows for the update. Note the use of the alias CAT for the CATEGORY table in the FROM clause.

## Updates with outer joins in the FROM clause
<a name="c_Examples_of_UPDATE_statements-updates-with-outer-joins-in-the-from-clause"></a>

The previous example showed an inner join specified in the FROM clause of an UPDATE statement. The following example returns an error because the FROM clause does not support outer joins to the target table: 

```
UPDATE category SET catid=100
FROM event LEFT JOIN category cat ON event.catid=cat.catid
WHERE cat.catgroup='Concerts';
ERROR:  Target table must be part of an equijoin predicate
```

If the outer join is required for the UPDATE statement, you can move the outer join syntax into a subquery: 

```
UPDATE category SET catid=100
FROM
(SELECT event.catid FROM event LEFT JOIN category cat ON event.catid=cat.catid) eventcat
WHERE category.catid=eventcat.catid
AND catgroup='Concerts';
```

## Updates with columns from another table in the SET clause
<a name="c_Examples_of_UPDATE_statements-set-with-column-from-another-table"></a>

To update the listing table in the TICKIT sample database with values from the sales table, use the following example.

```
SELECT listid, numtickets FROM listing WHERE sellerid = 1 ORDER BY 1 ASC LIMIT 5;

+--------+------------+
| listid | numtickets |
+--------+------------+
| 100423 | 4          |
| 108334 | 24         |
| 117150 | 4          |
| 135915 | 20         |
| 205927 | 6          |
+--------+------------+

UPDATE listing
SET numtickets = sales.sellerid
FROM sales
WHERE sales.sellerid = 1 AND listing.sellerid = sales.sellerid;

SELECT listid, numtickets FROM listing WHERE sellerid = 1 ORDER BY 1 ASC LIMIT 5;

+--------+------------+
| listid | numtickets |
+--------+------------+
| 100423 | 1          |
| 108334 | 1          |
| 117150 | 1          |
| 135915 | 1          |
| 205927 | 1          |
+--------+------------+
```

# USE
<a name="r_USE_command"></a>

Changes the database on which queries run. SHOW USE returns the database most recently set with the USE command. RESET USE resets the USEd database, which means that if a database isn't specified in the SQL, objects are searched for in the connected database.

## Syntax
<a name="r_USE-synopsis"></a>

```
USE database
```

## Examples
<a name="r_USE_command-examples"></a>

Suppose there are three databases: `dev`, `pdb`, and `pdb2`, each with a table `t` in its public schema. First, insert data into the tables across different databases:

```
dev=# insert into dev.public.t values (1);
INSERT 0 1
dev=# insert into pdb.public.t values (2);
INSERT 0 1
```

Without explicitly setting a database, the system uses your connected database. Check your current database context:

```
dev=# show use;
 Use Database
--------------

(1 row)
dev=# show search_path;
  search_path
---------------
 $user, public
(1 row)
```

When querying table `t` without specifying a database, the system uses the table in your current database:

```
dev=# select * from t;
c
----
1
(1 row)
```

Use the `use` command to switch databases without changing your connection:

```
dev=# use pdb;
USE
dev=# show use;
 Use Database
--------------
 pdb
(1 row)
dev=# select * from t;
id
----
2
(1 row)
```

You can also explicitly specify the schema:

```
dev=# select * from public.t;
id
----
2
(1 row)
```

You can now create tables in other schemas within your current database. Create the schema first:

```
dev=# create schema s1;
CREATE SCHEMA
dev=# create table s1.t(id int);
CREATE TABLE
dev=# insert into pdb.s1.t values (3);
INSERT 0 1
```

The search path determines which schema's objects are accessed when you don't specify a schema:

```
dev=# set search_path to public, s1;
SET
dev=# select * from t;
 id
----
  2
(1 row)
```

Change the order of schemas to access different tables:

```
dev=# set search_path to s1, public;
SET
dev=# show search_path;
 search_path
-------------
 s1, public
(1 row)
dev=# select * from t;
 id
----
  3
(1 row)
```

Switch to another database while maintaining your original connection:

```
dev=# show use;
 Use Database
--------------
 pdb
(1 row)
dev=# use pdb2;
USE
dev=# show use;
 Use Database
--------------
 pdb2
(1 row)
```

When switching databases, the search path resets to default:

```
dev=# show search_path;
  search_path
---------------
 $user, public
(1 row)
```

Create a table and insert data in your current database:

```
dev=# create table pdb2.public.t(id int);
CREATE TABLE
dev=# insert into pdb2.public.t values (4);
INSERT 0 1
dev=# select * from t;
 id
----
  4
(1 row)
```

In transactions, you can write to the current database and read from any database using three-part notation. This also includes the connected database:

```
dev=# show use;
 Use Database
--------------
 pdb2
(1 row)

dev=# BEGIN;
BEGIN
dev=# select * from t;
 id
----
  4
(1 row)

dev=# insert into t values (5);
INSERT 0 1
dev=# select * from t;
 id
----
  4
  5
(2 rows)

dev=# select * from pdb.public.t;
 id
----
  2
(1 row)

dev=# select * from dev.public.t;
 id
----
  1
(1 row)
```

Reset to your connected database. Note that this does not revert to the previously used database `pdb`; it resets to the connected database. The search path also changes back to the default: 

```
dev=# RESET USE;
RESET
dev=# select * from t;
c
----
1
(1 row)
dev=# show use;
 Use Database
--------------

(1 row)

dev=# show search_path;
  search_path
---------------
 $user, public
(1 row)
```

You can change databases at the start of a transaction, but not after running queries:

```
dev=# BEGIN;
BEGIN
dev=# use pdb;
USE
dev=# use pdb2;
USE
dev=# use pdb;
USE
dev=# select * from t;
 id
----
  2
(1 row)
dev=# use pdb2;
ERROR:  USEd Database cannot be set or reset inside a transaction after another command.
dev=# rollback;
ROLLBACK
```

### Data Catalog example
<a name="use-redlake-example"></a>

First, create tables in different schemas and catalogs to demonstrate cross-catalog queries. Start by creating tables in the connected database.

```
dev=# CREATE TABLE dev.public.t (col INT);
dev=# INSERT INTO dev.public.t VALUES (1);
dev=# CREATE SCHEMA write_schema;
dev=# CREATE TABLE dev.write_schema.t (state char (2));
dev=# INSERT INTO dev.write_schema.t VALUES ('WA');
```

Now, create similar tables in a different catalog. This demonstrates how to work with cross-catalog databases.

```
dev=# CREATE TABLE my_db@my_catalog.public.t (col INT);
dev=# INSERT INTO my_db@my_catalog.public.t VALUES (100);
dev=# CREATE SCHEMA my_db@my_catalog.write_schema;
dev=# CREATE TABLE my_db@my_catalog.write_schema.t (state char (2));
dev=# INSERT INTO my_db@my_catalog.write_schema.t VALUES ('CA');
```

Check the current database context. Without explicitly setting a database, the system uses the connected database.

```
dev=# SHOW USE;
 Use Database
--------------

(1 row)

dev=# SHOW search_path;
  search_path
---------------
 $user, public
(1 row)

dev=# SELECT * FROM t;
 col
-----
   1
(1 row)
```

Set the USEd database to query the tables in a different catalog.

```
dev=# USE my_db@my_catalog;

dev=# SHOW USE;
            Use Database
-------------------------------------
 my_db@my_catalog
(1 row)

dev=# SHOW search_path;
  search_path
---------------
 $user, public
(1 row)
```

When querying table t, results come from the cross-catalog database.

```
dev=# SELECT * FROM t;
 col
-----
 100
(1 row)

dev=# SELECT * FROM public.t;
 col
-----
 100
(1 row)

dev=# SELECT * FROM my_db@my_catalog.public.t;
 col
-----
 100
(1 row)
```

Change the search path to access tables in different schemas within the USEd database.

```
dev=# SET search_path to write_schema;

dev=# SHOW search_path;
 search_path
--------------
 write_schema
(1 row)

dev=# SELECT * FROM t;
 state
-------
 CA
(1 row)

dev=# SELECT * FROM write_schema.t;
 state
-------
 CA
(1 row)

dev=# SELECT * FROM my_db@my_catalog.write_schema.t;
 state
-------
 CA
(1 row)
```

Even though USE is set to a cross-catalog database, it's still possible to explicitly query the original database.

```
dev=# SELECT * FROM dev.write_schema.t;
 state
-------
 WA
(1 row)
```

Reset the USEd database to again refer to objects in the connected database.

```
dev=# RESET USE;

dev=# SHOW USE;
 Use Database
--------------

(1 row)
```

Note that the `search_path` gets reset when USE is reset.

```
dev=# SHOW search_path;
  search_path
---------------
 $user, public
(1 row)
```

After resetting, queries now refer to the original connected database.

```
dev=# SELECT * FROM t;
 col
-----
   1
(1 row)

dev=# SELECT * FROM public.t;
 col
-----
   1
(1 row)

dev=# SELECT * FROM dev.public.t;
 col
-----
   1
(1 row)
```

You can modify the search path in the original database to access different schemas.

```
dev=# SET search_path to write_schema;

dev=# SHOW search_path;
 search_path
--------------
 write_schema
(1 row)

dev=# SELECT * FROM t;
 state
-------
 WA
(1 row)

dev=# SELECT * FROM write_schema.t;
 state
-------
 WA
(1 row)

dev=# SELECT * FROM dev.write_schema.t;
 state
-------
 WA
(1 row)
```

# VACUUM
<a name="r_VACUUM_command"></a>

Re-sorts rows and reclaims space in either a specified table or all tables in the current database.

**Note**  
Only users with the necessary table permissions can effectively vacuum a table. If VACUUM is run without the necessary table permissions, the operation completes successfully but has no effect. For a list of valid table permissions to effectively run VACUUM, see the following Required privileges section.

Amazon Redshift automatically sorts data and runs VACUUM DELETE in the background. This lessens the need to run the VACUUM command. For more information, see [Vacuuming tables](t_Reclaiming_storage_space202.md). 

By default, VACUUM skips the sort phase for any table where more than 95 percent of the table's rows are already sorted. Skipping the sort phase can significantly improve VACUUM performance. To change the default sort or delete threshold for a single table, include the table name and the TO *threshold* PERCENT parameter when you run VACUUM. 

Users can access tables while they are being vacuumed. You can perform queries and write operations while a table is being vacuumed, but when data manipulation language (DML) commands and a vacuum run concurrently, both might take longer. If you run UPDATE and DELETE statements during a vacuum, system performance might be reduced. VACUUM DELETE temporarily blocks update and delete operations. 

Amazon Redshift automatically performs a DELETE ONLY vacuum in the background. Automatic vacuum operation pauses when users run data definition language (DDL) operations, such as ALTER TABLE.

**Note**  
The Amazon Redshift VACUUM command syntax and behavior are substantially different from the PostgreSQL VACUUM operation. For example, the default VACUUM operation in Amazon Redshift is VACUUM FULL, which reclaims disk space and re-sorts all rows. In contrast, the default VACUUM operation in PostgreSQL simply reclaims space and makes it available for reuse.

For more information, see [Vacuuming tables](t_Reclaiming_storage_space202.md).

## Required privileges
<a name="r_VACUUM_command-privileges"></a>

Following are required privileges for VACUUM:
+ Superuser
+ Users with the VACUUM privilege
+ Table owner
+ Owner of the database that the table is shared to

## Syntax
<a name="r_VACUUM_command-synopsis"></a>

```
VACUUM [ FULL | SORT ONLY | DELETE ONLY | REINDEX | RECLUSTER ]
[ [ table_name ] [ TO threshold PERCENT ] [ BOOST ] ]
```

## Parameters
<a name="r_VACUUM_command-parameters"></a>

FULL   <a name="vacuum-full"></a>
Sorts the specified table (or all tables in the current database) and reclaims disk space occupied by rows that were marked for deletion by previous UPDATE and DELETE operations. VACUUM FULL is the default.  
A full vacuum doesn't perform a reindex for interleaved tables. To reindex interleaved tables followed by a full vacuum, use the [VACUUM REINDEX](#vacuum-reindex) option.   
By default, VACUUM FULL skips the sort phase for any table that is already at least 95 percent sorted. If VACUUM is able to skip the sort phase, it performs a DELETE ONLY and reclaims space in the delete phase such that at least 95 percent of the remaining rows aren't marked for deletion.    
If the sort threshold isn't met (for example, if 90 percent of rows are sorted) and VACUUM performs a full sort, then it also performs a complete delete operation, recovering space from 100 percent of deleted rows.   
You can change the default vacuum threshold only for a single table. To change the default vacuum threshold for a single table, include the table name and the TO *threshold* PERCENT parameter. 

SORT ONLY   <a name="vacuum-sort-only"></a>
Sorts the specified table (or all tables in the current database) without reclaiming space freed by deleted rows. This option is useful when reclaiming disk space isn't important but re-sorting new rows is important. A SORT ONLY vacuum reduces the elapsed time for vacuum operations when the unsorted region doesn't contain a large number of deleted rows and doesn't span the entire sorted region. Applications that don't have disk space constraints but do depend on query optimizations associated with keeping table rows sorted can benefit from this kind of vacuum.  
By default, VACUUM SORT ONLY skips any table that is already at least 95 percent sorted. To change the default sort threshold for a single table, include the table name and the TO *threshold* PERCENT parameter when you run VACUUM. 

DELETE ONLY   <a name="vacuum-delete-only"></a>
Amazon Redshift automatically performs a DELETE ONLY vacuum in the background, so you rarely, if ever, need to run a DELETE ONLY vacuum.  
A VACUUM DELETE reclaims disk space occupied by rows that were marked for deletion by previous UPDATE and DELETE operations, and compacts the table to free up the consumed space. A DELETE ONLY vacuum operation doesn't sort table data.   
This option reduces the elapsed time for vacuum operations when reclaiming disk space is important but re-sorting new rows isn't important. This option can also be useful when your query performance is already optimal, and re-sorting rows to optimize query performance isn't a requirement.  
By default, VACUUM DELETE ONLY reclaims space such that at least 95 percent of the remaining rows aren't marked for deletion. To change the default delete threshold for a single table, include the table name and the TO *threshold* PERCENT parameter when you run VACUUM.    
Some operations, such as `ALTER TABLE APPEND`, can cause tables to be fragmented. When you use the `DELETE ONLY` clause the vacuum operation reclaims space from fragmented tables. The same threshold value of 95 percent applies to the defragmentation operation. 

REINDEX  <a name="vacuum-reindex"></a>
Analyzes the distribution of the values in interleaved sort key columns, then performs a full VACUUM operation. If REINDEX is used, a table name is required.  
VACUUM REINDEX takes significantly longer than VACUUM FULL because it makes an additional pass to analyze the interleaved sort keys. The sort and merge operation can take longer for interleaved tables because the interleaved sort might need to rearrange more rows than a compound sort.  
If a VACUUM REINDEX operation terminates before it completes, the next VACUUM resumes the reindex operation before performing the full vacuum operation.  
VACUUM REINDEX isn't supported with TO *threshold* PERCENT.  

RECLUSTER  <a name="vacuum-recluster"></a>
Sorts the portions of the table that are unsorted. Portions of the table that are already sorted by automatic table sort are left intact. This command doesn't merge the newly sorted data with the sorted region. It also doesn't reclaim all space that is marked for deletion. When this command completes, the table might not appear fully sorted, as indicated by the `unsorted` field in SVV\_TABLE\_INFO.   
 We recommend that you use VACUUM RECLUSTER for large tables with frequent ingestion and queries that access only the most recent data.   
 VACUUM RECLUSTER isn't supported with TO threshold PERCENT. If RECLUSTER is used, a table name is required.  
VACUUM RECLUSTER isn't supported on tables with interleaved sort keys and tables with ALL distribution style.

 *table\_name*   
The name of a table to vacuum. If you don't specify a table name, the vacuum operation applies to all tables in the current database. You can specify any permanent or temporary user-created table. The command isn't meaningful for other objects, such as views and system tables.  
 If you include the TO *threshold* PERCENT parameter, a table name is required.

 TO *threshold* PERCENT   
A clause that specifies the threshold above which VACUUM skips the sort phase and the target threshold for reclaiming space in the delete phase. The *sort threshold* is the percentage of total rows that are already in sort order for the specified table prior to vacuuming.  The *delete threshold* is the minimum percentage of total rows not marked for deletion after vacuuming.   
Because VACUUM re-sorts the rows only when the percent of sorted rows in a table is less than the sort threshold, Amazon Redshift can often reduce VACUUM times significantly. Similarly, when VACUUM isn't constrained to reclaim space from 100 percent of rows marked for deletion, it is often able to skip rewriting blocks that contain only a few deleted rows.  
For example, if you specify 75 for *threshold*, VACUUM skips the sort phase if 75 percent or more of the table's rows are already in sort order. For the delete phase, VACUUM sets a target of reclaiming disk space such that at least 75 percent of the table's rows aren't marked for deletion following the vacuum. The *threshold* value must be an integer between 0 and 100. The default is 95. If you specify a value of 100, VACUUM always sorts the table unless it's already fully sorted and reclaims space from all rows marked for deletion. If you specify a value of 0, VACUUM never sorts the table and never reclaims space.  
If you include the TO *threshold* PERCENT parameter, you must also specify a table name. If a table name is omitted, VACUUM fails.   
You can't use the TO *threshold* PERCENT parameter with REINDEX. 

BOOST  
Runs the VACUUM command with additional resources, such as memory and disk space, as they're available. With the BOOST option, VACUUM operates in one window and blocks concurrent deletes and updates for the duration of the VACUUM operation. Running with the BOOST option contends for system resources, which might affect query performance. Run VACUUM BOOST when the load on the system is light, such as during maintenance operations.   
Consider the following when using the BOOST option:  
+ When BOOST is specified, the *table\_name* value is required. 
+ BOOST isn't supported with REINDEX. 
+ BOOST is ignored with DELETE ONLY. 

## Usage notes
<a name="r_VACUUM_usage_notes"></a>

For most Amazon Redshift applications, a full vacuum is recommended. For more information, see [Vacuuming tables](t_Reclaiming_storage_space202.md).

Before running a vacuum operation, note the following behavior: 
+ You can't run VACUUM within a transaction block (BEGIN ... END). For more information about transactions, see [Isolation levels in Amazon Redshift](c_serial_isolation.md). 
+ Some amount of table growth might occur when tables are vacuumed. This behavior is expected when there are no deleted rows to reclaim or the new sort order of the table results in a lower ratio of data compression.
+ During vacuum operations, some degree of query performance degradation is expected. Normal performance resumes as soon as the vacuum operation is complete.
+ Concurrent write operations proceed during vacuum operations, but we don’t recommend performing write operations while vacuuming. It's more efficient to complete write operations before running the vacuum. Also, any data that is written after a vacuum operation has been started can't be vacuumed by that operation. In this case, a second vacuum operation is necessary.
+ A vacuum operation might not be able to start if a load or insert operation is already in progress. Vacuum operations temporarily require exclusive access to tables in order to start. This exclusive access is required briefly, so vacuum operations don't block concurrent loads and inserts for any significant period of time.
+ Vacuum operations are skipped when there is no work to do for a particular table; however, there is some overhead associated with discovering that the operation can be skipped. If you know that a table is pristine or doesn't meet the vacuum threshold, don't run a vacuum operation against it.
+ A DELETE ONLY vacuum operation on a small table might not reduce the number of blocks used to store the data, especially when the table has a large number of columns or the cluster uses a large number of slices per node. These vacuum operations add one block per column per slice to account for concurrent inserts into the table, and there is potential for this overhead to outweigh the reduction in block count from the reclaimed disk space. For example, if a 10-column table on an 8-node cluster occupies 1000 blocks before a vacuum, the vacuum doesn't reduce the actual block count unless more than 80 blocks of disk space are reclaimed because of deleted rows. (Each data block uses 1 MB.)
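For example, the first point means that a sketch like the following fails inside a transaction block but succeeds as a standalone statement (the error text shown is approximate):

```
BEGIN;
VACUUM sales;   -- fails: VACUUM can't run inside a transaction block
ROLLBACK;

VACUUM sales;   -- succeeds as its own statement
```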

Automatic vacuum operations pause if any of the following conditions are met: 
+ A user runs a data definition language (DDL) operation, such as ALTER TABLE, that requires an exclusive lock on a table that automatic vacuum is currently working on. 
+ A period of high cluster load.

### Support for concurrent VACUUM
<a name="r_VACUUM_usage_notes_concurrent"></a>

Amazon Redshift supports running multiple vacuum transactions concurrently across different sessions in a cluster or workgroup. This means you can issue multiple instances of any vacuum mode at once, as long as each vacuum transaction operates on a unique table. Two vacuum operations can't work on the same table at the same time.

**Guidelines for running concurrent vacuum**
+ When running concurrent vacuum transactions across different sessions, you should monitor system resources, and avoid running too many vacuum operations concurrently.
+ The recommended concurrency level depends on the amount of space to be reclaimed, the number and width of the rows to be sorted, the size of the warehouse, and the size of the workload running alongside the VACUUM operations.
+ Depending on the mode of the vacuum transaction, start with two concurrent vacuum operations, and add more depending on their run time and system load. Just like other heavy queries issued by users, vacuum operations may begin queueing if you run too many concurrently when Amazon Redshift reaches system resource limits. 
+ Run multiple VACUUM BOOST operations carefully. Running VACUUM with the BOOST option contends for system resources, which might affect query performance. Run VACUUM BOOST when the load on the system is light, such as during maintenance operations.
+ If you don't specify a table name, the vacuum operation applies to all tables in the current database. These vacuum operations still run sequentially.
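As a sketch, a concurrent vacuum runs each operation from a separate session, each on a unique table (table names are from the sample database; start with two operations and monitor system load before adding more):

```
-- Session 1
VACUUM FULL sales;

-- Session 2, at the same time, on a different table
VACUUM SORT ONLY listing;
```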

## Examples
<a name="r_VACUUM_command-examples"></a>

Reclaim space and re-sort rows in all tables in the current database, based on the default 95 percent vacuum threshold.

```
vacuum;
```

Reclaim space and re-sort rows in the SALES table based on the default 95 percent threshold. 

```
vacuum sales;
```

Always reclaim space and re-sort rows in the SALES table. 

```
vacuum sales to 100 percent;
```

Re-sort rows in the SALES table only if fewer than 75 percent of rows are already sorted. 

```
vacuum sort only sales to 75 percent;
```

Reclaim space in the SALES table such that at least 75 percent of the remaining rows aren't marked for deletion following the vacuum. 

```
vacuum delete only sales to 75 percent;
```

Reindex and then vacuum the LISTING table. 

```
vacuum reindex listing;
```

The following command returns an error. 

```
vacuum reindex listing to 75 percent;
```

Recluster and then vacuum the LISTING table. 

```
vacuum recluster listing;
```

Recluster and then vacuum the LISTING table with the BOOST option. 

```
vacuum recluster listing boost;
```