Recurrent Neural Networks
RNNs are a good architecture for working with sequential data of the form x^(1), ..., x^(τ). They are better suited than perceptrons or feed-forward layers for long sequences and for sequences of varying length, because they can exploit the relationship between adjacent members of a sequence when making predictions. Their ability to retain information about previous states and to process long sequences makes RNNs a good option for time-series analysis and natural language processing.
RNNs process sequential data by defining a recurrence relation over timesteps, typically of the form

s_k = f(s_{k-1}, x_k)

where s_k is the state at timestep k, x_k is the input at that timestep, and f is a (usually nonlinear) function, for example s_k = tanh(s_{k-1} W_rec + x_k W_x) with weight matrices W_rec and W_x learned during training. The final output of the network at a certain timestep is typically computed from one or more states. This structure allows the network to predict the next state from the current state and the current input.
Source for information and image: How to implement an RNN (1/2) - Minimal example
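The loop below is a minimal sketch of that recurrence in NumPy, assuming a tanh activation and illustrative dimensions (3 input features, 4 hidden units); the weights here are random placeholders rather than trained values.

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 4))    # input-to-state weights (placeholder values)
W_rec = rng.normal(size=(4, 4))  # state-to-state (recurrent) weights

def rnn_forward(xs):
    """Unroll the recurrence s_k = tanh(s_{k-1} @ W_rec + x_k @ W_x)."""
    s = np.zeros(4)               # initial state s_0
    states = []
    for x_k in xs:                # one step per sequence element
        s = np.tanh(s @ W_rec + x_k @ W_x)
        states.append(s)
    return np.stack(states)       # all hidden states, shape (len(xs), 4)

xs = rng.normal(size=(5, 3))      # a toy sequence of 5 timesteps
states = rnn_forward(xs)
print(states.shape)               # (5, 4)
```

Because the same weights are reused at every step, the network can process sequences of any length with a fixed number of parameters.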
RNNs are great for tasks that require a many-to-one or many-to-many mapping. For example, a "character-level RNN" uses the individual characters in a given string as its input, rather than words or sentences. The network learns the underlying patterns that govern which characters can follow which in a given language. In a many-to-one architecture, the RNN generates only one output, which could be a class the string belongs to.
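A minimal many-to-one sketch in PyTorch, assuming a hypothetical 26-letter one-hot vocabulary and 3 output classes (the tutorial notebook's actual model and data differ): only the final hidden state is used to classify the whole string.

```python
import torch
import torch.nn as nn

class CharRNNClassifier(nn.Module):
    def __init__(self, vocab_size=26, hidden_size=32, n_classes=3):
        super().__init__()
        self.rnn = nn.RNN(vocab_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, vocab_size) one-hot characters
        _, h_n = self.rnn(x)            # final hidden state: (1, batch, hidden)
        return self.fc(h_n.squeeze(0))  # one class-score vector per string

model = CharRNNClassifier()
batch = torch.eye(26)[torch.randint(26, (4, 10))]  # 4 one-hot strings of 10 chars
logits = model(batch)                               # shape (4, 3)
```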
In many-to-many architectures, the network receives an input sequence of characters and generates an output sequence of characters, for example mapping a string in a source language to a corresponding string in a target language.
Source: Going Under the Hood of Character-Level RNNs: A NumPy-based Implementation Guide
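A minimal many-to-many sketch under the same hypothetical setup: the hidden state at every timestep is projected to character logits, so the output sequence matches the input length. (Real translation systems usually use an encoder-decoder architecture instead, since source and target lengths differ.)

```python
import torch
import torch.nn as nn

rnn = nn.RNN(26, 32, batch_first=True)  # vocab_size=26, hidden_size=32
proj = nn.Linear(32, 26)                # hidden state -> next-character logits

x = torch.eye(26)[torch.randint(26, (4, 10))]  # 4 one-hot strings of 10 chars
out, _ = rnn(x)            # hidden states at every timestep: (4, 10, 32)
char_logits = proj(out)    # one character prediction per step: (4, 10, 26)
```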
We will transition into this Python notebook for our demonstration: pytorch_char_rnn_classification_tutorial.ipynb
UArizona DataLab, Data Science Institute, 2025