【英语】问题准备
个人相关¶
一些常见的问题
自我介绍¶
介绍一个做过的项目¶
介绍家乡¶
优势¶
为什么想要再南大读研¶
课程介绍¶
算法设计与分析¶
One of my favorite courses is Algorithm Design and Analysis[əˈnæləsɪs]. This course has significantly deepened my understanding of fundamental algorithms and their analysis. In the class, we started with sorting problems to introduce the divide-and-conquer strategy and heap data structures. As the course progressed, we delved[delvd] into more advanced data structures like disjoint sets, red-black trees, and hash tables. Later, we combined these concepts with graph theory to explain common algorithms such as minimum spanning trees. The course also covered greedy algorithms and dynamic programming, providing a comprehensive overview of essential algorithmic strategies.
In class, this theoretical[ˌθiəˈretɪk (ə) l] foundation helped strudents understand various algorithms. We also solved problems on online judges, which is helpful to improve coding and problem-solving skills. For me personally, This course inspired my interest in algorithms and taught me a lot.
离散数学¶
Discrete[dɪˈskriːt] Mathematics is an important course that covers many topics needed for computer science. The course starts with logic and reasoning, where we learn about basic logic, how to proofs. Then, we study set theory, and learning about functions and relations. We also learn about combinatorics[ˌkɒmbɪnəˈtɒrɪks], which is the study of counting methods, like how to count different combinations. Another important part is graph theory, where we study different types of graphs, such as trees and connected graphs. The course includes number theory, covers basic number operations and properties. We also learn about algebraic[ˌældʒɪˈbreɪɪk] structures, like groups and rings. Additionally, there is a part on probability, where we learn basic probability concepts. This course provides a strong foundation for further studies in algorithms, data structures, and theoretical computer science.
编译原理¶
Compiler Principles is a course that explains how a compiler works to translate high-level programming languages into machine code. The process starts with lexical analysis, where the source code is read and converted into tokens. Next is syntax analysis, where the tokens are checked for correct grammatical[ɡrəˈmætɪkl] structure, forming a syntax tree. After that, we have semantic analysis, which ensures the syntax tree follows the rules of the language,and create a symbol table. Then comes intermediate code generation, where the syntax tree is translated into an intermediate representation that is easier to optimize. Intermediate code optimization is the next step, where the code is improved to run more efficiently[ɪˈfɪʃ(ə)ntli]. Finally, there is target code generation, where the optimized intermediate code is translated into machine code that the computer can execute. This course helps students understand the detailed process of how programming languages are compiled and prepares them for advanced studies in computer science.
计算机系统基础¶
Computer Systems Fundamentals is a course that covers how computer systems work. It starts with machine representation of numbers, explaining how computers represent and process different types of data. Then, it moves on to instructions, where we learn about the basic commands that a computer's CPU can execute. The course also covers linking, which is the process of combining various pieces of code and data into a single executable program. We study the memory hierarchy[ˈhaɪərɑːki], including caches, main memory, and storage, to understand how data is managed efficiently. Exceptions are another key topic, where we learn about interrupts and how the system handles unexpected events. Lastly, the course covers input/output (I/O) operations, teaching us how computers interact with external devices. This course provides a comprehensive understanding of the inner workings of computer systems, preparing students for more advanced topics in computer science.
操作系统¶
Operating Systems is a course that teaches how an operating system manages computer hardware and software resources. It starts with processes and threads, explaining how to handles multiple tasks at once through concurrent[kənˈkʌrənt] programming. We learn about the differences between processes and threads and how they share resources. The course also covers virtualization, focusing on memory allocation. This includes how the OS uses techniques like paging and segmentation[ˌseɡmenˈteɪʃn] to manage memory efficiently and ensure each process has its own virtual memory space. Lastly, the course deals with persistence, which involves managing files and I/O operations. We study how the OS organizes, stores data on disk, ensuring data is stored permanently and efficiently accessed when needed. This course provides a solid understanding of the core functions of an operating system, preparing students for advanced topics in computer science and system design.
数据结构¶
Data Structures is a course that covers the essential ways to organize and store data . It begins with linear data structures such as lists, strings, arrays, stacks, and queues, explaining how to use and implement them for various applications. The course then explores trees, including binary trees and balanced trees, and graphs, covering concepts like shortest paths. We also learn about different sorting algorithms, such as quicksort and mergesort,. Another key topic is hash tables, which provide fast data search by using hash functions. Lastly, we study the file system structures like B-trees and B+ trees, which are used in databases. This course provides a comprehensive understanding of how to efficiently manage data, which is fundamental for advanced computer science studies.
程序设计基础¶
Introduction to Programming is a course that teaches the basics of programming using the C++ language. It covers fundamental programming concepts such as variables, data types, and control structures like loops and conditionals. Students learn how to write simple programs to solve problems, and as the course progresses, they are introduced to more advanced topics like functions, arrays, and pointers. This course provides a strong foundation in programming, enabling students to develop problem-solving skills.
算法与数据结构¶
专题:时间空间复杂度¶
- 时间复杂度Time Complexity
- 空间复杂度Space Complexity
二叉搜索树¶
A Binary Search Tree is a type of binary tree. The key rule is that the left child has a smaller value, and the right child has a larger value.
1. Searching start at the root, compare the value, and move left or right accordingly until find the value or reach a leaf.
2. Inserting works the same way, and place the new value where find a suitable spot. (at leaves)
3. Deletion involves replacing the node with its in-order successor[səkˈsesər] if it has two children.
1. Find the In-Order Successor: The in-order successor of N
is the node with the smallest key greater than N
. To find the in-order successor, move to the right child of N
and then keep moving to the left child until reach a node with no left child. This node is the in-order successor, let's call it S
.
2. Replace N
with S
: Replace the key and value of N
with the key and value of S
.
3. Delete the Original S
Node: Since S
is always in the right subtree of N
and S
has no left child, the deletion of S
can be handled using a simpler case of the deletion process. S
may have a right child or no children at all. If S
has a right child, replace S
with its right child. If S
has no children, simply remove S
.
快速排序¶
Quicksort is a fast sorting algorithm. 1. First selecting a "pivot[ˈpɪvət]" element from the array and then rearranging[ˌriəˈreɪndʒ] the other elements into two groups: those less than the pivot and those greater than the pivot. 2. The pivot is then in its final position. This process is repeated for the two groups, recursively sorting the sub-arrays.
冒泡排序¶
Bubble Sort is a simple sorting algorithm. 1. It repeatedly stepping through the list, comparing each pair of adjacent[əˈdʒeɪs(ə)nt] items, and swapping them if they are in the wrong order. 2. Then start again from the beginning and repeat the process until no swaps are needed, meaning the list is sorted.
归并排序¶
Merge Sort is a sorting algorithm that uses a divide and conquer approach . 1. First, divide the list into smaller sublists until each sublist has only one item. 2. Then, it repeatedly merges these sublists to produce new sorted sublists until there is only one sorted list remaining. 3. The main steps are to split the list into halves, sort each half, and then merge the sorted halves together.
希尔排序¶
Shell Sort is an advanced version of insertion sort that improves by comparing and sorting elements that are far apart. By starting with larger gaps, Shell Sort moves elements closer to their final position faster,. 1. It starts by sorting elements with a large gap between them, then progressively[prəʊˈɡresɪvli] reduces the gap and sorts the elements again. 2. This process continues until the gap is 1, at which point it performs a final insertion sort.
选择排序¶
Selection Sort works by repeatedly selecting the smallest element from the unsorted part and swapping it with the first unsorted element. 1. Start with the first element as the current minimum. 2. Scan the remaining unsorted portion of the list to find the smallest element. 3. Swap the smallest element found with the first unsorted element. 4. Then move to next element and repeat
插入排序¶
Insertion Sort repeatedly picking the next element and inserting it into its correct position among the previously sorted elements. 1. Start with the second element, considering the first element as a sorted sublist. 2. Compare the current element with the elements in the sorted sublist. 3. Shift all elements in the sorted sublist that are greater than the current element one position to the right. 4. Insert the current element into its correct position. 5. Move to the next element and repeat
拓扑排序¶
Topological sorting is an algorithm used to order the vertices of a directed acyclic[ˌeɪ'saɪklɪk] graph (DAG) in a linear sequence. It is useful for resolving dependencies, such as task scheduling.
The algorithm can be implemented by using Breadth-First Searching. 1. Compute the in-degree for each vertex. 2. Place all vertices with an in-degree of 0 into queue. 3. Repeatedly remove a vertex from the queue, add it to the topological order, and reduce the in-degree of its neighboring vertices by 1. 4. If any neighbor’s in-degree becomes 0, add it to the queue. 5. Continue until the queue is empty. The result is a list of vertices in topological order.
二分搜索¶
Binary search is an efficient algorithm for finding an item in a sorted list. It works by repeatedly dividing the search interval in half. The steps are as follows: 1. Middle Element: Find the middle element of the current interval. 2. Comparison: Compare the middle element with the target value. - If the middle element is equal to the target, the search is complete. - If the middle element is less than the target, narrow the search to the upper half of the interval. - If the middle element is greater than the target, narrow the search to the lower half of the interval. 3. Repeat: Repeat the process on the new interval until the target is found or the interval is empty.
kruskal¶
Kruskal's algorithm is a method used to find the minimum spanning tree. The MST is a subset of the graph's edges that connects all vertices with the minimum possible total edge weight and no cycles. 1. Sort all edges in the graph by their weight in ascending order. 2. Iterate through the sorted edges, adding each edge to the MST if it doesn't form a cycle with the edges already in the MST. 3. Continue this process until the MST contains exactly 𝑛−1 edges
prim¶
Prim's algorithm is a method used to find the minimum spanning tree. 1. Start with a single vertex and consider it as part of the MST. 2. Find the smallest edge connecting a vertex in the MST to a vertex outside the MST. 3. Add this edge and the vertex to the MST. 4. Repeat until all vertices are included in
Dijkstra¶
Dijkstra's algorithm is used to find the shortest path from a single source vertex to all other vertices. 1. Initialize the distance to the source vertex as 0 and to all other vertices as infinity.Set the source vertex as the current vertex and mark it as visited. 2. For the current vertex, update the distances to its adjacent vertices. If a shorter path is found, update the distance. 3. Select the unvisited vertex with the smallest distance as the new current vertex. 4. Repeat until all vertices are visited or the smallest distance among the unvisited vertices is infinity.
bellman-ford¶
The Bellman-Ford algorithm is used to find the shortest paths from a single source vertex to all other vertices. It can handle graphs with negative edge weights. 1. Initialize the distance to the source vertex as 0 and the distance to all other vertices as infinity. 2. repeat the following process (|V| - 1) times:For each edge (u, v) with weight w, if the distance to u plus w is less than the distance to v, update the distance to v. 3. After |V| - 1 iterations, check for negative-weight cycles[ˈsaɪklz] by iterating through all edges again. If a shorter path is found, it indicates the presence of a negative-weight cycle.
floyd¶
The Floyd-Warshall algorithm is used to find the shortest paths between all pairs of vertices. 1. Create a distance matrix[ˈmeɪtrɪks] to store the shortest distance between every pair of vertices, initializing it with the edge weights from the graph. 2. Iterate through each vertex as an intermediate point and update the distance matrix. For each pair of vertices (i, j), check if a path through an intermediate vertex k offers a shorter path. If so, update the distance matrix with this new shorter path. 3. Repeat the process for all vertices as the intermediate vertex.
dfs¶
Depth-First Search starts at a given node and explores as far as possible along each branch before backtracking. 1. Start at the root node 2. Visit the current node and mark it as visited. 3. Recursively visit each unvisited adjacent node, going as deep as possible along each path. 4. Backtrack when no unvisited adjacent nodes are left, and continue to the next unvisited node.
bfs¶
Breadth-First Search (BFS) starts at a given node and explores all of its neighboring nodes at the present depth before moving on to nodes at the next depth level. 1. Start at the root node 2. Visit the current node and mark it as visited. 3. Add all unvisited adjacent nodes to a queue. 4. Dequeue a node from the front of the queue, visit it, and mark it as visited. 5. Repeat until the queue is empty.
堆¶
A heap is a special tree-based data structure.There are two types of heaps: min-heaps and max-heaps. In a min-heap, the value of each node is greater than the value of its parent. In a max-heap, the situation is exactly the opposite. 1. Insertion: Add a new element to the heap and maintain the heap property by "bubbling up" the element to its correct position. 2. Deletion: Remove the root element and maintain the heap property by "bubbling down" the last element to its correct position.
并查集¶
Union-Find, is a data structure that keeps track of a partition of a set into subsets. It is used to efficiently manage and merge sets. 1. Find: Determine which subset a particular element is in. 2. Union: Merge two subsets into a single subset. To improve the efficiency of these operations, two techniques are commonly used: 1. Path Compression: During the Find operation, make each node point directly to the root, speeding up future operations. 2. Union by Rank: Always attach the smaller tree to the root of the larger tree to keep the tree as flat as possible.
红黑树¶
A Red-Black Tree is a type of self-balancing binary search tree which can ensure that the tree remains balanced, providing efficient operations.
1. Each node is either red or black.
2. The root node is always black.
3. All leaves are black.
4. Red nodes cannot have red children
5. Every path from a given node to its descendant[dɪˈsendənt] leaves has the same number of black nodes.
These properties ensure that the longest path from the root to a leaf is no more than twice as long as the shortest path, guaranteeing that the tree remains approximately balanced.
AVL 树¶
An AVL Tree is a type of self-balancing binary search tree. It maintains balance by ensuring that the height difference between the left and right subtrees of any node is at most one. To maintain this balance, AVL trees perform rotations during insertions and deletions: 1. Left Rotation: Applied when a node becomes unbalanced to the right. 2. Right Rotation: Applied when a node becomes unbalanced to the left. 3. Left-Right Rotation: Applied when a left child becomes unbalanced to the right. 4. Right-Left Rotation: Applied when a right child becomes unbalanced to the left. These rotations help to restore the balance of the tree, ensuring that the height remains logarithmic[ˌlɒɡə'rɪðmɪk] relative to the number of nodes.
哈希表¶
A hash table is a data structure that provides fast insertion, deletion, and lookup operations by using a hash function to map keys,allows for quick access to the data. When two keys produce the same hash index, a collision[kəˈlɪʒ(ə)n] occurs. There are two primary methods to handle collisions: 1. chaining: each bucket points to a linked list that share the same hash index, and new entries are simply added to the list when collisions occur. 2. Open addressing: on the other hand, stores all entries directly in the array. When a collision occurs in open addressing, the algorithm searches for the next empty slot using methods such as linear probing[ˈproʊbɪŋ].
kmp¶
The KMP algorithm efficiently finds a "pattern" string within a "text" string by avoiding unnecessary comparisons. It does this in two steps: 1. Construct a match table which is called as next array for the pattern, indicating the longest proper prefix which is also a suffix. 2. Use this table to skip unnecessary comparisons during the search, resuming the match from the appropriate position in the pattern. KMP achieves linear time complexity for the search phase.
动态规划¶
Dynamic Programming (DP) is a method for solving complex problems by breaking them down into simpler subproblems. 1. Identify the subproblems and solve each one only once. 2. Store the results of subproblems in a table to avoid redundant[rɪˈdʌndənt] calculations. 3. Use these stored results to build up solutions to larger problems. Two way to implement: - Top-Down (Memoization): Solve the problem recursively and store the results of subproblems. - Bottom-Up (Tabulation): Solve the smallest subproblems first and use their results to iteratively solve larger subproblems. Dynamic Programming is efficient for problems with overlapping subproblems and optimal substructure.
贪心¶
A greedy algorithm is a problem-solving method that makes a series of choices, each of which looks the best at the moment. The goal is to find the overall optimal solution. 1. Start with an empty solution. 2. At each step, add the best current choice to the solution. 3. Repeat until the solution is complete or all choices have been considered. Greedy algorithms are used when a problem has the "greedy-choice property," meaning a locally optimal choice will lead to a globally optimal solution.
问题复杂度¶
P refers to the class of problems that can be solved in polynomial[ˌpɒliˈnəʊmiəl] time. NP is the class of problems for which a given solution can be verified quickly in polynomial[ˌpɒliˈnəʊmiəl] time. NP-Complete refers to a subset of NP problems that are the hardest among them. It has two key properties: 1. NP: The problem is in NP, meaning a solution can be verified quickly. 2. NP-Hard: Every problem in NP can be transformed into this problem in polynomial time.
线性表¶
Linear List A linear list is a collection of elements where each element has a unique predecessor[ˈpriːdəsesə(r)] and successor, except for the first and last elements. 1. Array (Sequential[sɪˈkwenʃ(ə)l] List):An array stores elements in contiguous[kənˈtɪɡjuəs] memory locations, allowing fast access via index but has a fixed size and costly insertions/deletions. 2. Linked List:A linked list consists of nodes, each containing data and a reference to the next node. It allows dynamic sizing and efficient insertions/deletions but has slower access and memory overhead due to pointers. Restricted Linear Lists Restricted linear lists like stacks and queues impose additional rules on how elements can be added or removed, making them useful for specific situations such as function call management and task scheduling. 1. Stack: A stack is a linear list with Last-In-First-Out (LIFO) access, where elements are added and removed from the top. 2. Queue: A queue is a linear list with First-In-First-Out (FIFO) access, where elements are added at the back and removed from the front.
计算机网络¶
TCP/IP 架构¶
Computer Networks is a course talk about technologies used to connect computers and other devices to share resources and information. We mainly studied the TCP/IP five-layer protocol. 1. Physical Layer: Handles the physical connection between devices. 2. Data Link Layer: Manages node-to-node data transfer and error detection. 3. Network Layer: Uses IP to address and route packets between devices. 4. Transport Layer: Uses TCP to ensure reliable data transfer and correct packet sequencing. 5. Session Layer: Controls and manages the conversations between computers, such as logging in and out, maintaining connections. 6. Presentation Layer: Makes sure data is in the right format for the application. It translates, encrypts[ɪnˈkrɪpt], and compresses data. 7. Application Layer: Supports application services and end-user processes, like HTTP, FTP, etc.
TCP 与 UDP¶
TCP is a connection-oriented protocol[ˈprəʊtəkɒl] that ensures reliable and ordered delivery of data. It establishes a connection before data transfer and checks for errors. UDP is a connectionless protocol that allows for fast and efficient data transmission without guaranteeing delivery, order, or error checking.
操作系统¶
死锁¶
Deadlock is a situation where two or more processes cannot continue because each one is waiting for the other to release a resource. This happens when four conditions occur at the same time: 1. Each pocket has one ball. To continue, a person must get the ball. 2. The person with the ball always wants more balls. 3. They cannot take the ball from someone else who is holding it. 4. This creates a circular wait condition.
互斥锁¶
A mutex, is a synchronization primitive used in concurrent programming to prevent multiple threads or processes from accessing a shared resource together. - Lock: When a thread wants to access a shared resource, it must first acquire the mutex. If the mutex is already held by another thread, the requesting thread is blocked until the mutex is released. - Unlock: Once the thread finishes using the shared resource, it releases the mutex, allowing other waiting threads to acquire it and access the resource.
racecondition¶
A race condition is a situation in concurrent programming where the behavior of software depends on the relative timing of multiple threads or processes. This can lead to unpredictable results and bugs that are hard to reproduce and debug.
信号量¶
A semaphore[ˈseməˌfɔr] is a synchronization tool used to manage access to shared resources in concurrent programming. There are two main types of semaphores: 1. Binary Semaphore: Also known as a mutex, it has two values, 0 and 1, and ensures exclusion, allowing only one to access the resource at a time. 2. Counting Semaphore: It can have a non-negative integer value and allows multiple processes or threads to access. Semaphores use two main operations: - Wait (P operation): Decrements the semaphore value, blocking the process or thread if the value is negative. - Signal (V operation): Increments the semaphore value, unblocking a waiting process or thread if any.
进程与线程¶
- Process: A process is an independent program that has its own memory space. It includes the program code, its own data, and other resources like file handles. Processes are isolated[ˈaɪsəleɪtɪd] from each other, and communication between them typically requires inter-process communication (IPC) mechanisms[ˈmekəˌnɪzəm].
- Thread: A thread is the smallest unit of execution within a process. Threads within the same process share the same memory space and resources but can run independently. This makes threads more lightweight and efficient for parallel tasks compared to processes.
编译原理¶
编译器的基本结构¶
- Lexical[ˈleksɪk(ə)l] Analysis:The source code is converted into a sequence of tokens.
- Syntax Analysis:
- The tokens are analyzed['ænəlaizd] according to the grammatical[ɡrəˈmætɪkl] rules.
- A parser builds a syntax tree that represents the grammatical structure of the source code.
- Semantic[sɪˈmæntɪk] Analysis:
- The syntax tree is checked for semantic[sɪˈmæntɪk] errors to ensure the code comply with language's rules.
- Symbol tables are constructed and the syntax tree is marked with type information.
- Intermediate Code Generation:The syntax tree is translated into an intermediate representation[ˌreprɪzenˈteɪʃ(ə)n] that is easier to optimize and translate into machine code.
- Intermediate Code Optimization:
- The intermediate code is optimized to improve performance and reduce resource usage.
- Optimizations can include removing useless code, and optimizing loops.
- Target Code Generation:
- The optimized intermediate code is translated into machine code.
- This stage involves allocating registers, selecting instructions, and generating the final executable code. The compiler is divided into two main parts: the front end and the back end.
- Front End: The front end includes lexical analysis, syntax analysis, and semantic analysis. It reads the source code, converts it into tokens, checks for syntax and semantic errors, and generates an intermediate representation (IR). This phase ensures the code is correctly written and follows the language's rules.
- Back End: The back end involves intermediate code optimization and target code generation. It takes the IR from the front end, optimizes it for performance, and translates it into machine code for the target architecture. This phase focuses on improving efficiency and generating executable code.
正则表达式&NFA&DFA¶
Regular Expressions Regular expressions (regex[ˈreɡeks]) are sequences of characters that define a search pattern. They are used for pattern matching within strings. Regular expressions are widely used in search engines, text editors, and programming languages. DFA (Deterministic Finite Automaton) A Deterministic[dɪˌtɜːmɪˈnɪstɪk] Finite[ˈfaɪnaɪt] Automaton (DFA) has a finite number of states and transitions between them based on the input symbol. Each state has exactly one transition for each possible input. NFA (Nondeterministic Finite Automaton) A Nondeterministic Finite Automaton (NFA) is similar to a DFA but allows for multiple transitions for a single input symbol. NFAs are also used to recognize regular languages. Although NFAs can be more complex due to their multiple transitions, they can be converted to equivalent DFAs.
上下文无关文法¶
A Context-Free Grammar (CFG) is a type of formal grammar used to define the syntax of programming languages and natural languages. In CFG, production rules are used to generate strings from a start symbol. Key features: - Production Rules: Each rule defines how a non-terminal symbol can be replaced by a combination of terminal and non-terminal symbols. - Start Symbol: A special non-terminal symbol from which generation begins. - Terminals: The basic symbols from which strings are formed. - Non-Terminals: Symbols that can be replaced using production rules.
计算机组成原理¶
Cache¶
A cache is a small, high-speed storage area close to the CPU that stores frequently accessed data and instructions. The main purpose of a cache is to speed up data access by reducing the time it takes to access information from the main memory. When the CPU needs data, it first checks if the data is in the cache. If the data is found, it is quickly. If the data is not found, it is searched from the slower main memory, and a copy is placed in the cache for future use.
虚拟内存机制¶
Virtual memory is a system that allows a computer to use more memory than what is physically available . It helps run large programs and manage multiple applications at the same time. Paging: - Memory is divided into fixed-size blocks called pages. - The operating system keeps track of all pages. - By using a page table, the mapping between virtual memory and physical memory can be achieved. Segmentation: - Memory is divided into variable-sized segments, each representing[ˌreprɪˈzentɪŋ] a logical part of the program, like code, data, or stack. - Each segment has a base address and a length. - Segmentation can be used for functional partitioning[pɑːrˈtɪʃənɪŋ] and access control.
程序设计¶
面向对象特点¶
- Encapsulation is the concept of wrapping data and methods that work on the data within a single unit. It helps to protect the data from outside interference and misuse.
- Inheritance allows a new class to inherit properties and methods from an existing class. This helps in code reusability.
- Polymorphism[ˌpɒlɪˈmɔːfɪz(ə)m] allows objects of different classes to be treated as objects of a common superclass. It is achieved through method overriding and interfaces. allowing for dynamic method invocation[ˌɪnvəˈkeɪʃn]. These three concepts are key principles of Object-Oriented Programming.